perm filename V235.TEX[TEX,DEK]2 blob
sn#363576 filedate 1978-06-20 generic text, type C, neo UTF8
\input acphdr % Section 3.5
\runninglefthead{RANDOM NUMBERS} % chapter title
\titlepage\setcpage0
\vfill
\tenpoint
\ctrline{SECTION 3.5 of THE ART OF COMPUTER PROGRAMMING}
\ctrline{$\copyright$ 1978 Addison--Wesley Publishing Company, Inc.}
\vfill
\runningrighthead{WHAT IS A RANDOM SEQUENCE?}
\section{3.5}
\eject
\setcpage 140
%folio 190 galley 13 (C) Addison-Wesley 1978 *
\sectionbegin{3.5. WHAT IS A RANDOM SEQUENCE?}
{\bf A. Introductory remarks.}\xskip We
have seen in this chapter how to generate sequences
$$\langle U↓n\rangle = U↓0,\ U↓1,\ U↓2,\ \ldots\eqno(1)$$
of real numbers in the range $0 ≤ U↓n < 1$, and
we have called them ``random'' sequences even though they are
completely deterministic in character. To justify this terminology,
we claimed that the numbers ``behave as if they are truly random.''
Such a statement may be satisfactory for practical purposes
(at the present time), but it sidesteps a very important philosophical
and theoretical question: Precisely what do we mean by ``random
behavior''? A quantitative definition is needed. It is undesirable
to talk about concepts that we do not really understand, especially
since many apparently paradoxical statements can be made about
random numbers.

The mathematical theory of probability and statistics carefully
sidesteps the question; it refrains from making absolute
statements, and instead expresses everything in terms of how
much {\sl probability} is to be attached to statements involving
random sequences of events. The axioms of probability theory
are set up so that abstract probabilities can be computed readily,
but nothing is said about what probability really signifies,
or how this concept can be applied meaningfully to the actual
world. In the book {\sl Probability, Statistics, and Truth} (New York: Macmillan,
1957), R. von Mises discusses this situation in detail, and
presents the view that a proper definition of probability depends
on obtaining a proper definition of a random sequence.

Let us paraphrase here some statements made by two of the many
authors who have commented on the subject.
\vskip 12pt plus 3pt minus 7pt
\hang{\sl D. H. Lehmer} (1951):
``A random sequence is a vague notion embodying the idea of
a sequence in which each term is unpredictable to the uninitiated
and whose digits pass a certain number of tests, traditional
with statisticians and depending somewhat on the uses to which
the sequence is to be put.''
\yyskip\hang{\sl J. N. Franklin} (1962): ``The
sequence (1) is random if it has every property that is shared
by all infinite sequences of independent samples of random variables
from the uniform distribution.''
\vskip 12pt plus 3pt minus 7pt
Franklin's statement essentially
generalizes Lehmer's to say that the sequence must satisfy {\sl
all} statistical tests. His definition is not completely precise,
and we will see later that a reasonable interpretation of his
statement leads us to conclude that there is no such thing as
a random sequence! Let us begin with Lehmer's less restrictive
statement and attempt to make {\sl it} precise. What we really want
is a relatively short list of mathematical properties, each
of which is satisfied by our intuitive notion of a random sequence;
furthermore, the list is to be complete enough so that we are
willing to agree that {\sl any} sequence satisfying these properties
is ``random.'' In this section, we will develop what seems to
be an adequate definition of randomness according to these criteria,
although many interesting questions remain to be answered.

Let $u$ and $v$ be real numbers, $0 ≤ u < v ≤ 1$. If $U$ is
a random variable which is uniformly distributed between 0 and
1, the probability that $u ≤ U < v$ is equal to $v - u$. For
example, the probability that ${1\over 3} ≤ U < {2\over 3}$
is ${1\over 3}$. How can we translate this property of the single
number $U$ into a property of the infinite sequence $U↓0$, $U↓1$,
$U↓2$, $\ldotss $? The obvious answer is to count how many times
$U↓n$ lies between $u$ and $v$, and the average number of times
should equal $v - u$. Our intuitive idea of probability is based
in this way on the frequency of occurrence.

More precisely,
let $\nu(n)$ be the number of values of $j$, $0 ≤ j < n$, such that
$u ≤ U↓j < v$; we want the ratio $\nu (n)/n$ to approach the
value $v - u$ as $n$ approaches infinity:
$$\lim↓{n→∞}\nu (n)/n = v - u.\eqno(2)$$
If this condition holds for all choices of $u$
and $v$, the sequence is said to be {\sl equidistributed.}

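The counting in (2) is easy to try by machine. The following sketch (ours, not the book's) uses the Weyl sequence $U↓n = n\theta \mod 1$ with $\theta$ the golden ratio, a standard example of an equidistributed sequence, and checks that $\nu(n)/n$ comes close to $v - u$:

```python
import math

# Counting as in (2): nu(n)/n should approach v - u.  The Weyl sequence
# n*theta mod 1 with theta the golden ratio is a standard equidistributed
# sequence; this concrete choice is ours, for illustration only.
theta = (1 + math.sqrt(5)) / 2
n = 100000
U = [(j * theta) % 1.0 for j in range(n)]

def nu(seq, u, v):
    """Count the terms of seq lying in the half-open interval [u, v)."""
    return sum(1 for x in seq if u <= x < v)

ratio = nu(U, 1/3, 2/3) / n     # should be near 2/3 - 1/3 = 1/3
```

With $n = 100000$ the observed ratio agrees with ${1\over 3}$ to within a small fraction of a percent.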
Let $S(n)$ be a statement about the integer $n$ and the sequence
$U↓1$, $U↓2$, $\ldotss $; for example, $S(n)$ might be the statement
considered above, namely ``$u ≤ U↓n < v$.'' We can generalize
the idea used in the preceding paragraph to define ``the probability
that $S(n)$ is true'' with respect to a particular infinite
sequence: Let $\nu (n)$ be the number of values of $j$, $0 ≤ j
< n$, such that $S(j)$ is true.
\thbegin Definition A. {\sl We say $\Pr\biglp
S(n)\bigrp = λ$, if $\lim↓{n→∞} \nu (n)/n = λ$.} (Read,
``The probability that $S(n)$ is true is $λ$, if the limit as
$n$ tends to infinity of $\nu (n)/n$ is $λ$.'')
\yyskip In terms of this notation, the sequence $U↓0$,
$U↓1$, $\ldots$ is equidistributed if and only if $\Pr(u ≤ U↓n <
v) = v - u$, for all real numbers $u$, $v$ with $0 ≤ u < v ≤ 1$.

A sequence may be equidistributed without being random. For
example, if $U↓0$, $U↓1$, $\ldots$ and $V↓0$, $V↓1$, $\ldots$ are equidistributed
sequences, it is not hard to show that the sequence
$$\textstyle W↓0, W↓1, W↓2, W↓3, \ldotss = {1\over 2}U↓0,\ {1\over 2}\! +\!
{1\over 2}V↓0,\ {1\over 2}U↓1,\ {1\over 2}\!+\!{1\over 2}V↓1,\ \ldots\eqno(3)$$
is also equidistributed, since the sequence ${1\over
2}U↓0$, ${1\over 2}U↓1$, $\ldots$ is equidistributed between 0 and
${1\over 2}$, while the alternate terms, ${1\over 2}\!+\!{1\over
2}V↓0$, ${1\over 2}\!+\!{1\over 2}V↓1$, $\ldotss $, are equidistributed
between ${1\over 2}$ and 1. In the sequence of $W$'s, a value
less than ${1\over 2}$ is always followed by a value greater
than or equal to ${1\over 2}$, and conversely; hence that sequence
is not random by any reasonable definition. A stronger property
than equidistribution is needed.

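The interleaved sequence (3) can be built and tested directly. In this sketch (our construction) two Weyl sequences stand in for the equidistributed sequences $\langle U↓n\rangle$ and $\langle V↓n\rangle$; the result is equidistributed, yet every value below ${1\over 2}$ is followed by one at or above ${1\over 2}$:

```python
import math

# Two equidistributed [0,1) sequences; the Weyl-sequence choice (multiples
# of sqrt(2) and sqrt(3) mod 1) is ours, purely for illustration.
n = 50000
U = [(j * math.sqrt(2)) % 1.0 for j in range(n)]
V = [(j * math.sqrt(3)) % 1.0 for j in range(n)]

# Interleave as in (3): W = U0/2, 1/2 + V0/2, U1/2, 1/2 + V1/2, ...
W = []
for u, v in zip(U, V):
    W.append(u / 2)
    W.append(0.5 + v / 2)

# W is (empirically) equidistributed: about half its values land in [1/4, 3/4) ...
ratio = sum(1 for w in W if 0.25 <= w < 0.75) / len(W)

# ... yet a value below 1/2 is always followed by one at or above 1/2:
alternates = all((W[i] < 0.5) != (W[i + 1] < 0.5) for i in range(len(W) - 1))
```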
A natural generalization of the equidistribution property, which
removes the objection stated in the preceding paragraph, is
to consider adjacent pairs of numbers of our sequence. We can
require the sequence to satisfy the condition
$$\Pr(u↓1 ≤ U↓n < v↓1\quad\hjust{and}\quad u↓2 ≤ U↓{n+1} < v↓2)
= (v↓1 - u↓1)(v↓2 - u↓2)\eqno(4)$$
for any four numbers $u↓1$, $v↓1$, $u↓2$, $v↓2$ with
$0 ≤ u↓1 < v↓1 ≤ 1$, $0 ≤ u↓2 < v↓2 ≤ 1$. In general, for any
positive integer $k$ we can require our sequence to be $k${\sl-distributed}
in the following sense:
\thbegin Definition B. {\sl The sequence $(1)$ is said
to be $k$-distributed if
$$\Pr(u↓1 ≤ U↓n < v↓1, \ldotss, u↓k ≤U↓{n+k-1}<v↓k)=(v↓1-u↓1)\ldotsm
(v↓k-u↓k)\eqno(5)$$
for all choices of real numbers $u↓j$, $v↓j$, with $0≤u↓j<v↓j≤1$, for $1≤j≤k$.}
%folio 194 galley 14 (C) Addison-Wesley 1978 *
\yyskip An equidistributed sequence is a 1-distributed sequence.
Note that if $k > 1$, a $k$-distributed sequence is always $(k
- 1)$-distributed, since we may set $u↓k = 0$ and $v↓k = 1$
in Eq.\ (5). Thus, in particular, any sequence which is known
to be 4-distributed must also be 3-distributed, 2-distributed,
and equidistributed. We can investigate the largest $k$ for
which a given sequence is $k$-distributed; and this leads us
to formulate
\thbegin Definition C. {\sl A sequence is said to be $∞$-distributed
if it is $k$-distributed for all positive integers $k$.}
\yyskip So far we have considered ``$[\,0,1)$ sequences,''
i.e., sequences of real numbers lying between zero and one.
The same ideas apply to integer-valued sequences; let us say
a sequence $\langle X↓n\rangle = X↓0$, $X↓1$, $X↓2$, $\ldots$ is
a ``$b$-ary sequence'' if each $X↓n$ is one of the integers
0, 1, $\ldotss$, $b - 1$. Thus, a 2-ary (binary) sequence is a sequence
of zeros and ones.

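The passage from a $[\,0,1)$ sequence to a $b$-ary sequence is just the map $X↓n = \lfloor bU↓n\rfloor$; a minimal sketch (sample values chosen by us to be exactly representable in floating point):

```python
def to_b_ary(U, b):
    """Map a [0,1) sequence to a b-ary sequence via X_n = floor(b * U_n)."""
    return [int(b * u) for u in U]

# Values chosen to be exactly representable, so the floors are unambiguous.
U = [0.0, 0.75, 0.5, 0.25, 0.875]
bits   = to_b_ary(U, 2)    # a 2-ary (binary) sequence
digits = to_b_ary(U, 10)   # a 10-ary sequence
```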
A $k$-digit ``$b$-ary number'' $x↓1x↓2 \ldotsm x↓k$ is an ordered
set of $k$ integers, where $0 ≤ x↓j < b$ for $1 ≤ j ≤ k$.
\thbegin Definition D. {\sl A $b$-ary sequence is
said to be $k$-distributed if
$$\Pr(X↓nX↓{n+1} \ldotsm X↓{n+k-1} = x↓1x↓2 \ldotsm x↓k) = 1/b↑k\eqno(6)$$
for all $b$-ary numbers $x↓1x↓2 \ldotsm x↓k$.}
\yyskip It is clear from this definition that if
$U↓0$, $U↓1$, $\ldots$ is a $k$-distributed $[\,0, 1)$ sequence, then
the sequence $\lfloor bU↓0\rfloor$, $\lfloor bU↓1\rfloor$, $\ldots$
is a $k$-distributed $b$-ary
sequence. $\biglp$If we set $u↓j = x↓j/b$, $v↓j = (x↓j + 1)/b$, $X↓n
= \lfloor bU↓n\rfloor$, Eq.\ (5) becomes Eq.\ (6).$\bigrp$ Furthermore, every
$k$-distributed $b$-ary sequence, for $k > 1$, is also $(k -
1)$-distributed: we add together the probabilities for the $b$-ary
numbers $x↓1 \ldotsm x↓{k-1}0$, $x↓1 \ldotsm x↓{k-1}1$, $\ldotss$,
$x↓1 \ldotsm x↓{k-1}(b - 1)$ to obtain
$$\Pr(X↓n \ldotsm X↓{n+k-2} = x↓1 \ldotsm x↓{k-1}) = 1/b↑{k-1}.$$
(Probabilities for disjoint events are additive;
see exercise 5.) It therefore is natural to speak of an $∞$-distributed
$b$-ary sequence, as in Definition C above.
The representation of a positive real number in the radix-$b$
number system may be regarded as a $b$-ary sequence; for example,
$π$ corresponds to the 10-ary sequence 3, 1, 4, 1, 5, 9, 2,
6, 5, 3, 5, 8, 9, $\ldotss\,$. It has been conjectured that this
sequence is $∞$-distributed, but nobody has yet been able to prove
that it is even 1-distributed.

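A tally over a short prefix of $π$'s 10-ary sequence is easy to make (the 51 digits below are the standard opening decimals of $π$); of course, so short a prefix can only hint at 1-distribution, since the conjecture concerns the limit:

```python
from collections import Counter

# The first 51 digits of pi's 10-ary sequence 3, 1, 4, 1, 5, 9, 2, 6, ...
pi_digits = "314159265358979323846264338327950288419716939937510"

counts = Counter(pi_digits)
# If the sequence is 1-distributed, each digit's frequency tends to 1/10.
freq = {d: counts[d] / len(pi_digits) for d in "0123456789"}
```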
Let us analyze these concepts a little more closely in the case
when $k$ equals a million. A binary sequence which is 1000000-distributed
is going to have runs of a million zeros in a row! Similarly,
a $[\,0,1)$ sequence which is 1000000-distributed is going to have
runs of a million consecutive values each of which is less than
${1\over 2}$. It is true that this will happen only $({1\over
2})↑{1000000}$ of the time, on the average, but the fact is
that it {\sl does} happen. Indeed, this phenomenon will occur in
any truly random sequence, using our intuitive notion of ``truly
random.'' One can easily imagine that such a situation will
have a drastic effect if this set of a million ``truly random''
numbers is being used in a computer-simulation experiment; there
would be good reason to complain about the random-number generator.
However, if we have a sequence of numbers which never has runs
of a million consecutive $U$'s less than ${1\over 2}$, the sequence
is not random, and it will not be a suitable source of numbers
for other conceivable applications which use extremely long
blocks of $U$'s as input. In summary, {\sl a truly random sequence
will exhibit local nonrandomness\/}; local nonrandomness
is necessary in some applications, but it is disastrous in others.
We are forced to conclude that no sequence of ``random''
numbers can be adequate for every application.

In a similar vein, one may argue that there is no way to judge
whether a {\sl finite} sequence is random or not; any particular
sequence is just as likely as any other one. These facts are
definitely stumbling blocks if we are ever to have a useful
definition of randomness, but they are not really cause for
alarm. It is still possible to give a definition for the randomness
of infinite sequences of real numbers in such a way that the
corresponding theory (viewed properly) will give us a great
deal of insight concerning the ordinary finite sequences of
rational numbers which are actually generated on a computer.
Furthermore, we shall see later in this section that there are
several plausible definitions of randomness for finite sequences.
\subsectionbegin{B. $∞$-distributed sequences}
Let us now undertake a brief study of the theory of sequences that are
$∞$-distributed. To describe the theory adequately, we will need to
use a bit of higher mathematics, so we assume in the remainder
of this subsection that the reader knows the material ordinarily
taught in an ``advanced calculus'' course.
%folio 196 galley 1 (C) Addison-Wesley 1978 *
First it is convenient to generalize Definition A, since
the limit appearing there does not exist for all sequences.
Let us define
\def\Pro{\mathop{\overline{\char'120\char'162}}}
\def\Pru{\mathop{\underline{\char'120\char'162}}}
$$\Pro\biglp S(n)\bigrp = \limsup↓{n→∞}
\biglp \nu (n)/n\bigrp,\qquad\Pru\biglp S(n)\bigrp = \liminf
↓{n→∞}\biglp \nu (n)/n\bigrp .\eqno (7)$$
Then $\Pr\biglp S(n)\bigrp $, if it exists, is
the common value of $\Pru\biglp S(n)\bigrp$ and $\Pro\biglp
S(n)\bigrp $.
\vskip 1pt
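A sequence for which the plain limit in Definition A fails, while the $\limsup$ and $\liminf$ of (7) exist, is easy to construct; this sketch (our example) uses a binary sequence built from blocks of doubling length:

```python
# A binary sequence assembled from blocks of doubling length:
# one 0, two 1s, four 0s, eight 1s, ...  The running frequency of 1s never
# settles down, so Pr does not exist for the statement "X_n = 1"; the
# oscillation is captured by Pr-bar = 2/3 and Pr-under = 1/3.
X = []
for i in range(17):
    X.extend([i % 2] * (2 ** i))

ones = 0
ratios = []
for n, x in enumerate(X, start=1):
    ones += x
    ratios.append(ones / n)

tail = ratios[1000:]           # discard a transient prefix
hi, lo = max(tail), min(tail)  # empirical limsup and liminf
```

At the end of each block of 1s the ratio is exactly ${2\over 3}$, and at the end of each block of 0s it has fallen back almost to ${1\over 3}$.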
We have seen that a $k$-distributed $[\,0,1)$ sequence leads to
a $k$-distributed $b$-\penalty100 ary sequence, if $U$ is replaced by $\lfloor
bU\rfloor $. Our first theorem shows that a converse result
is true:
\thbegin Theorem A. {\sl Let $\langle U↓n\rangle
= U↓0$, $U↓1$, $U↓2$, $\ldots$ be a $[\,0,1)$ sequence. If the sequence
$$\langle \lfloor b↓jU↓n\rfloor \rangle = \lfloor b↓jU↓0\rfloor ,\
\lfloor b↓jU↓1\rfloor ,\ \lfloor b↓jU↓2\rfloor ,\ \ldots$$
is a $k$-distributed $b↓j$-ary sequence for all $b↓j$
in an infinite sequence of integers $1 < b↓1 < b↓2 < b↓3 < \cdotss$,
then the original sequence $\langle U↓n\rangle$ is $k$-distributed.}
\yyskip As an example of this theorem, suppose
that $b↓j = 2↑j$. The sequence $\lfloor 2↑jU↓0\rfloor$, $\lfloor
2↑jU↓1\rfloor$, $\ldots$ is essentially the sequence of the first
$j$ bits of the binary representations of $U↓0$, $U↓1$, $\ldotss\,$.
If all these integer sequences are $k$-distributed, in the sense
of Definition D, the real-valued sequence $U↓0$, $U↓1$, $\ldots$
must also be $k$-distributed in the sense of Definition B.
\yyskip\noindent{\sl Proof of Theorem A.}\xskip If the sequence
$\lfloor bU↓0\rfloor$, $\lfloor bU↓1\rfloor$, $\ldots$ is $k$-distributed,
it follows by the addition of probabilities that Eq.\ (5) holds
whenever each $u↓j$ and $v↓j$ is a rational number with denominator
$b$. Now let $u↓j$, $v↓j$ be any real numbers, and let $u↑\prime↓{j}$,
$v↑\prime↓j$ be rational numbers with denominator $b$ such that
$$u↑\prime↓j
≤ u↓j < u↑\prime↓{j} + 1/b,\qquad v↑\prime↓{j} ≤ v↓j < v↑\prime↓{j}
+ 1/b.$$
Let $S(n)$ be the statement that $u↓1 ≤ U↓n < v↓1$,
$\ldotss$, $u↓k ≤ U↓{n+k-1} < v↓k$. We have
$$\twoline{\Pro\biglp S(n)\bigrp ≤ \Pr\left(u↑\prime↓{1} ≤ U↓n
< v↑\prime↓{1} + {1\over b} ,\; \ldotss ,\; u↑\prime↓{k} ≤
U↓{n+k-1} < v↑\prime↓{k} + {1\over b}\right)}{1pt}{
=\left(v↓1↑\prime-u↓1↑\prime+{1\over b}\right)\ldotsm
\left(v↓k↑\prime-u↓k↑\prime+{1\over b}\right);}$$
\vskip-18pt
$$\twoline{\Pru\biglp S(n)\bigrp ≥ \Pr\left(u↑\prime↓{1}+{1\over b} ≤ U↓n
< v↑\prime↓{1} ,\; \ldotss ,\; u↑\prime↓{k}+{1\over b} ≤
U↓{n+k-1} < v↑\prime↓{k} \right)}{1pt}{
=\left(v↓1↑\prime-u↓1↑\prime-{1\over b}\right)\ldotsm
\left(v↓k↑\prime-u↓k↑\prime-{1\over b}\right).}$$
Now $|(v↑\prime↓{j} - u↑\prime↓{j} \pm
1/b) - (v↓j - u↓j)| ≤ 2/b$; since our inequalities hold for
all $b = b↓j$, and since $b↓j → ∞$ as $j → ∞$, we have
$$(v↓1 - u↓1) \ldotsm (v↓k - u↓k) ≤\Pru\biglp S(n)\bigrp ≤
\Pro\biglp S(n)\bigrp ≤ (v↓1 - u↓1) \ldotsm (v↓k - u↓k).\quad\blackslug$$
The next theorem is our main tool for proving
things about $k$-distributed sequences.
\thbegin Theorem B. {\sl Let $\langle U↓n\rangle$
be a $k$-distributed $[\,0,1)$ sequence, and let $f(x↓1, x↓2, \ldotss
, x↓k)$ be a Riemann-integrable function of $k$ variables;
then}
$$\eqalign{⊗\lim↓{n→∞} {1\over n} \sum ↓{0≤j<n} f(U↓j,
U↓{j+1}, \ldotss , U↓{j+k-1})\cr
⊗\qquad=\int ↑{1}↓{0} \cdotss \int ↑{1}↓{0} f(x↓1, x↓2,
\ldotss , x↓k)\, dx↓1 \ldotsm dx↓k.\cr}\eqno(8)$$
\vskip-4pt
\proofbegin The definition of a $k$-distributed
sequence states that this result is true in the special case
that
$$f(x↓1, \ldotss , x↓k) = \left\{\eqalign{1,⊗\qquad\hjust{if }
u↓1 ≤ x↓1 < v↓1, \ldotss , u↓k ≤ x↓k < v↓k;\cr 0,⊗\qquad\hjust{otherwise.}\cr}
\right.\eqno(9)$$
Therefore Eq.\ (8) is true whenever $f = a↓1f↓1
+ a↓2f↓2 + \cdots + a↓mf↓m$ and when each $f↓j$ is a function
of type (9); in other words, Eq.\ (8) holds whenever $f$ is a
``step-function'' obtained by (i) partitioning the unit $k$-dimensional
cube into subcells whose faces are parallel to the coordinate
axes, and (ii) assigning a constant value to $f$ on each subcell.
Now let $f$ be any Riemann-integrable function. If $\epsilon$
is any positive number, we know (by the definition of Riemann-integrability)
that there exist step functions $\underline f$ and $\overline f$ such that
$\underline f(x↓1, \ldotss , x↓k) ≤ f(x↓1, \ldotss , x↓k) ≤ \overline f(x↓1,
\ldotss , x↓k)$, and such that the difference of the integrals
of $\underline f$ and $\overline f$ is less than $\epsilon $. Since Eq.\
(8) holds for $\underline f$ and $\overline f$, and since
$$\eqalign{{1\over n} \sum ↓{0≤j<n}\underline f(U↓j, \ldotss , U↓{j+k-1})⊗≤ {1\over
n} \sum ↓{0≤j<n} f(U↓j, \ldotss , U↓{j+k-1})\cr
⊗≤ {1\over n} \sum ↓{0≤j<n} \overline f(U↓j, \ldotss
, U↓{j+k-1}),\cr}$$
we conclude that Eq.\ (8) is true also for $f$.\quad\blackslug
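Theorem B can be tried numerically in the simplest case $k = 1$: for an equidistributed sequence, the average of $f(U↓j)$ approaches the integral of $f$. In this sketch the Weyl sequence $n\sqrt2 \mod 1$ stands in for a 1-distributed sequence (our choice of example, not the book's), with $f(x) = x↑2$:

```python
import math

# Theorem B with k = 1: the average of f(U_j) over an equidistributed
# sequence approaches the integral of f over [0,1).  Here f(x) = x^2,
# whose integral is 1/3; the Weyl sequence n*sqrt(2) mod 1 is our stand-in.
n = 200000
U = [(j * math.sqrt(2)) % 1.0 for j in range(n)]

avg = sum(u * u for u in U) / n   # should be near 1/3
```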
\yyskip Theorem B can be applied, for example, to
the {\sl permutation test} described in Section 3.3.2.
Let $(p↓1, p↓2, \ldotss , p↓k)$ be any permutation of the numbers
$\{1, 2, \ldotss , k\}$; we want to show that
$$\Pr(U↓{n+p↓1-1} < U↓{n+p↓2-1}<\cdots<U↓{n+p↓k-1}) = 1/k!.\eqno(10)$$
To prove this, assume that the sequence $\langle
U↓n\rangle$ is $k$-distributed, and let
$$f(x↓1, \ldotss , x↓k) = \left\{\eqalign{1,⊗\qquad\hjust{if }x↓{p↓1}<x↓{p↓2}<
\cdots<x↓{p↓k};\cr 0,⊗\qquad\hjust{otherwise.}\cr}\right.$$
We have
$$\eqalign{⊗\Pr(U↓{n+p↓1-1} < U↓{n+p↓2-1} < \cdots
< U↓{n+p↓k-1})\cr\noalign{\vskip 3pt}
⊗\qquad=\int↓0↑1\cdotss\int↓0↑1 f(x↓1,\ldotss,x↓k)\,dx↓1\ldotsm dx↓k\cr
⊗\qquad=\int↓0↑1\,dx↓{p↓k}\int↓0↑{x↓{p↓k}}\cdotss\int↓0↑{x↓{p↓3}}\,dx↓{p↓2}
\int↓0↑{x↓{p↓2}}\,dx↓{p↓1}={1\over k!}.\cr}$$
\vskip-6pt
\thbegin Corollary P. {\sl If a $[\,0,1)$ sequence
is $k$-distributed, it satisfies the permutation test of order
$k$, in the sense of Eq.\ $(10)$.}\quad\blackslug
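An empirical permutation test of order 3 in the sense of Eq.\ (10) looks like this; Python's built-in generator stands in for a 3-distributed sequence, and the seed and sample size are arbitrary choices of ours:

```python
import random
from itertools import permutations
from collections import Counter

# Permutation test of order k = 3: each of the 3! relative orderings of
# adjacent triples should appear with frequency near 1/6.
rng = random.Random(12345)
n = 100000
U = [rng.random() for _ in range(n)]

counts = Counter()
for j in range(n - 2):
    w = U[j:j + 3]
    # the relative ordering of (U_j, U_{j+1}, U_{j+2}), as a permutation
    counts[tuple(sorted(range(3), key=w.__getitem__))] += 1

freqs = {p: counts[p] / (n - 2) for p in permutations(range(3))}
```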
\yyskip We can also show that the {\sl serial correlation
test} is satisfied:
\thbegin Corollary S. {\sl If a $[\,0,1)$ sequence is $(k
+ 1)$-distributed, the serial correlation coefficient between
$U↓n$ and $U↓{n+k}$ tends to zero:}
\def\\{{1\over n}\sum}
$$\lim↓{n→∞}{\\U↓jU↓{j+k}-\left(\\U↓j\right)\left(\\U↓{j+k}\right)\over
\sqrt{\biglp\\U↓{\!j}↑2-\biglp\\U↓j\bigrp↑2\bigrp\biglp\\U↓{\!j+k}↑2-\biglp\\U↓{j+k}
\bigrp↑2\bigrp}}=0.$$
(All summations here are for $0 ≤ j < n$.)
\proofbegin By Theorem B, the quantities
$$\textstyle\\U↓jU↓{j+k},\qquad\\U↓{\!j}↑2,\qquad\\U↓{\!j+k}↑2,
\qquad\\U↓j,\qquad\\U↓{j+k}$$
tend to the respective limits ${1\over 4}$,
${1\over 3}$, ${1\over 3}$, ${1\over 2}$, ${1\over 2}$ as $n
→ ∞$.\quad\blackslug
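The serial correlation coefficient of Corollary S can be computed directly for a sample sequence; here (our sketch) the built-in generator and seed are stand-ins, and the coefficient between $U↓j$ and $U↓{j+1}$ indeed comes out near zero:

```python
import random

# Corollary S checked numerically for k = 1: the serial correlation between
# U_j and U_{j+1} should tend to zero for a (pseudo)random sequence.
rng = random.Random(2718)
n = 100000
U = [rng.random() for _ in range(n + 1)]
a = U[:n]          # U_j
b = U[1:n + 1]     # U_{j+1}

def mean(xs):
    return sum(xs) / len(xs)

cov = mean([x * y for x, y in zip(a, b)]) - mean(a) * mean(b)
var_a = mean([x * x for x in a]) - mean(a) ** 2
var_b = mean([y * y for y in b]) - mean(b) ** 2
corr = cov / (var_a * var_b) ** 0.5

# As in the proof, the averages of U_j and of U_j^2 approach 1/2 and 1/3.
```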
%folio 198 galley 2 (C) Addison-Wesley 1978 *
\yyskip Let us now consider some slightly more general distribution
properties of sequences. We have defined the notion of $k$-distribution
by considering all of the adjacent $k$-tuples; for example, a sequence
is 2-distributed if and only if the points
$$(U↓0, U↓1),\ (U↓1, U↓2),\ (U↓2, U↓3),\
(U↓3, U↓4),\ (U↓4, U↓5),\ \ldots$$
are equidistributed in the unit square. It is quite
possible, however, that this can happen while alternate pairs of points $(U↓1,
U↓2)$, $(U↓3, U↓4)$, $(U↓5, U↓6)$, $\ldots$ are {\sl not} equidistributed;
if the density of points $(U↓{2n-1}, U↓{2n})$ is deficient in
some area, the other points $(U↓{2n}, U↓{2n+1})$ might compensate.
For example, the periodic binary sequence
$$\langle X↓n\rangle = 0, 0, 0, 1,\hskip 5pt 0, 0, 0, 1,\hskip 5pt 1,
1, 0, 1,\hskip 5pt 1, 1, 0, 1,\hskip 5pt 0, 0, 0, 1,\hskip 5pt \ldotss ,\eqno(11)$$
with a period of length 16, is seen to be 3-distributed;
yet the sequence of even-numbered elements $\langle X↓{2n}\rangle
= 0$, 0, 0, 0, 1, 0, 1, 0, $\ldots$ has three times as many zeros
as ones, while the subsequence of odd-numbered elements $\langle
X↓{2n+1}\rangle = 0$, 1, 0, 1, 1, 1, 1, 1, $\ldots$ has three
times as many ones as zeros.

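The claims about sequence (11) can be verified mechanically over one cyclic period: each of the eight binary triples occurs exactly twice (so the sequence is 3-distributed), while the even- and odd-indexed subsequences are biased 3 to 1:

```python
from collections import Counter

# Direct verification of the claims about the period-16 sequence (11).
period = [0,0,0,1, 0,0,0,1, 1,1,0,1, 1,1,0,1]

# Each of the 8 binary triples occurs exactly twice per (cyclic) period.
triples = Counter(tuple(period[(i + d) % 16] for d in range(3))
                  for i in range(16))

even = period[0::2]   # X_0, X_2, X_4, ...
odd  = period[1::2]   # X_1, X_3, X_5, ...
```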
If a sequence $\langle U↓n\rangle$ is $∞$-distributed, example (11)
shows that it is not at all obvious that the subsequence
of alternate terms $\langle U↓{2n}\rangle = U↓0$, $U↓2$, $U↓4$, $U↓6$,
$\ldots$ will be $∞$-distributed or even 1-distributed. But we shall
see that $\langle U↓{2n}\rangle$ is, in fact, $∞$-distributed,
and much more is true.
\thbegin Definition E. {\sl A $[\,0,1)$ sequence
$\langle U↓n\rangle$ is said to be $(m, k)$-distributed if
$$\twoline{\Pr(u↓1 ≤ U↓{mn+j} < v↓1,\, u↓2 ≤ U↓{mn+j+1} < v↓2,\, \ldotss ,\,
u↓k ≤ U↓{mn+j+k-1} < v↓k)}{6pt}{
= (v↓1 - u↓1) \ldotsm (v↓k - u↓k)}$$
for all choices of real numbers $u↓r$, $v↓r$ with
$0 ≤ u↓r < v↓r ≤ 1$ for $1 ≤ r ≤ k$, and for all integers $j$ with
$0 ≤ j < m$.}
\yyskip\noindent Thus a $k$-distributed sequence is the special
case $m = 1$ in Definition E; the case $m = 2$ means that the
$k$-tuples starting in even positions must have the same density as the
$k$-tuples starting in odd positions, etc.
Several properties of Definition E are obvious:
$$\eqalignno{⊗\hjust{\sl An $(m, k)$-distributed sequence is
$(m,\kappa)$-distributed for $1 ≤ \kappa ≤ k$.}⊗(12)\cr
\noalign{\vskip 4pt}
⊗\hjust{\sl An $(m, k)$-distributed sequence is $(d, k)$-distributed
for all divisors $d$ of $m$.}⊗(13)\cr}$$
We can also define the concept of an $(m, k)$-distributed
$b$-ary sequence, as in Definition D; and the proof of Theorem
A remains valid for $(m, k)$-distributed sequences.
The next theorem, which is in many ways rather surprising, shows
that the property of being $∞$-distributed is very strong indeed,
much stronger than we imagined it to be when we first considered
the definition of the concept.
\thbegin Theorem C {\rm(Ivan Niven and H. S. Zuckerman)}.
{\sl An $∞$-distributed sequence is $(m, k)$-distributed for all positive
integers $m$ and $k$.}
\proofbegin It suffices to prove the theorem
for $b$-ary sequences, by using the generalization of Theorem
A just mentioned. Furthermore, we may assume that $m = k$: according to
(12) and (13), the sequence will be $(m, k)$-distributed
if it is $(mk, mk)$-distributed.
So we will prove that {\sl any $∞$-distributed $b$-ary sequence
$X↓0$, $X↓1$, $\ldots$ is $(m, m)$-distributed for all positive integers
$m$.} Our proof is a simplified version of the original one given by Niven
and Zuckerman in {\sl Pacific J. Math.\ \bf 1}
(1951), 103--109.
The key idea we shall use is an important technique that
applies to many situations in mathematics: ``If the sum of
$m$ quantities and the sum of their squares are both consistent
with the hypothesis that the $m$ quantities are equal, then that
hypothesis is true.'' In a strong form, this principle may be
stated as follows:
\thbegin Lemma E. {\sl Given $m$ sequences of numbers $\langle
y↓{jn}\rangle = y↓{j0}$, $y↓{j1}$, $y↓{j2}$, $\ldots$ for $1 ≤ j ≤ m$,
suppose that
$$\eqalign{\lim↓{n→∞}(y↓{1n} + y↓{2n} + \cdots
+ y↓{mn})⊗ = mα,\cr
\limsup↓{n→∞}(y↑{2}↓{1n} + y↑{2}↓{2n} + \cdots + y↑{2}↓{mn})
⊗≤ mα↑2.\cr}\eqno(14)$$
Then for each $j$, $\lim↓{n→∞}y↓{jn}$ exists and equals $α$.}
\yyskip\noindent An incredibly simple proof of this lemma
is given in exercise 9.\quad\blackslug
\yyskip Resuming our proof of Theorem C,
let $x = x↓1x↓2 \ldotsm x↓m$ be a $b$-ary number, and say that $x$
{\sl occurs} at position $p$ of the sequence if $X↓{p-m+1}X↓{p-m+2}
\ldotsm X↓p = x$. Let $\nu ↓j(n)$ be the number of occurrences
of $x$ at position $p$ when $p < n$ and $p \mod m = j$. Let
$y↓{jn} = \nu ↓j(n)/n$; we wish to prove that
$$\lim↓{n→∞} y↓{jn} = 1/mb↑m.\eqno (15)$$
First we know that
$$\lim↓{n→∞} (y↓{0n} + y↓{1n} + \cdots + y↓{(m-1)n})= 1/b↑m,\eqno (16)$$
since the sequence is $m$-distributed. By Lemma
E and Eq.\ (16), the theorem will be proved if we can show that
$$\limsup↓{n→∞} (y↑{2}↓{0n} + y↑{2}↓{1n} +
\cdots + y↑{2}↓{(m-1)n}) ≤ 1/mb↑{2m}.\eqno (17)$$
This inequality is not obvious yet; some
rather delicate maneuvering is necessary before we can prove
it. Let $q$ be a multiple of $m$, and consider
$$C(n) = \sum ↓{0≤j<m}{\nu ↓j(n) - \nu ↓j(n - q)\choose2}.\eqno (18)$$
This is the number of pairs of occurrences of $x$
in positions $p↓1, p↓2$ with $n - q ≤ p↓1 < p↓2 < n$ and with
$p↓2 - p↓1$ a multiple of $m$. Consider now the sum
$$S↓N = \sum ↓{1≤n≤N+q} C(n).\eqno (19)$$
Each pair of occurrences of $x$ in positions $p↓1,
p↓2$ with $p↓1 < p↓2 < p↓1 + q$, where $p↓2 - p↓1$ is a multiple of $m$
and $p↓1 ≤ N$, is counted $p↓1 + q - p↓2$ times in the total
$S↓N$ (namely, when $p↓2 < n ≤ p↓1 + q$); and the pairs of such
occurrences with $N < p↓1 < p↓2 < N + q$ are counted $N + q
- p↓2$ times.
Let $d↓t(n)$ be the number of pairs of occurrences of $x$ in
positions $p↓1, p↓2$ with $p↓1 + t = p↓2 < n$. The analysis
above shows that
$$\sum ↓{0<t<q/m} (q - mt) d↓{mt}(N + q) ≥ S↓N ≥ \sum ↓{0<t<q/m}
(q - mt) d↓{mt}(N).\eqno (20)$$
Since the original sequence is $q$-distributed,
$$\lim↓{N→∞}{1\over N} d↓{mt}(N) = 1/b↑{2m}\eqno(21)$$
for all $t$, $0 < t < q/m$, and therefore by (20) we have
$$\lim↓{N→∞}{S↓N\over N}=\sum↓{0<t<q/m}(q-mt)/b↑{2m}=q(q-m)/2mb↑{2m}.\eqno(22)$$
This fact will prove the theorem, after some manipulation.
%folio 201 galley 3 (C) Addison-Wesley 1978 *
\yskip By definition,
$$2S↓N = \sum ↓{1≤n≤N+q}\,\, \sum ↓{0≤j<m} \biglp (\nu ↓j(n) -
\nu ↓j(n - q))↑2 - (\nu ↓j(n) - \nu ↓j(n - q))\bigrp ,$$
and we can remove the unsquared terms by applying
(16) to get
$$\lim↓{N→∞} {T↓N\over N} = q(q - m)/mb↑{2m}+ q/b↑m,\eqno (23)$$
where
$$T↓N = \sum ↓{1≤n≤N+q}\,\, \sum ↓{0≤j<m} \biglp \nu ↓j(n)
- \nu ↓j(n - q)\bigrp ↑2.$$
Using the inequality
$${1\over r}\bigglp\sum ↓{1≤j≤r}a↓j\biggrp↑2 ≤ \sum ↓{1≤j≤r} a↑{2}↓{j}$$
(cf.\ exercise 1.2.3--30), we find that
$$\eqalignno{⊗\limsup↓{N→∞} \sum ↓{0≤j<m} {1\over N(N +
q)} \bigglp\sum ↓{1≤n≤N+q}\biglp\nu ↓j(n) - \nu ↓j(n - q)\bigrp\biggrp↑2\cr
⊗\hskip 60mm ≤ q(q - m)/mb↑{2m} + q/b↑m.⊗(24)\cr}$$
We also have
$$q\nu ↓j(N) ≤ \sum ↓{N<n≤N+q}\nu ↓j(n) = \sum ↓{1≤n≤N+q}\biglp
\nu ↓j(n) - \nu ↓j(n - q)\bigrp ≤ q\nu ↓j(N + q),$$
and putting this into (24) gives
$$\limsup↓{N→∞} \sum ↓{0≤j<m}\biglp \nu ↓j(N)/N\bigrp
↑2 ≤ (q - m)/qmb↑{2m} + 1/qb↑m.\eqno (25)$$
This formula has been established whenever $q$ is a
multiple of $m$; and if we let $q → ∞$ we obtain (17), completing
the proof.
For a possibly simpler proof, see J. W. S. Cassels, {\sl Pacific
J. Math.\ \bf 2} (1952), 555--557.\quad\blackslug
\yyskip Exercises 29 and 30 illustrate the nontriviality
of this theorem, and they also demonstrate the fact
that a $q$-distributed sequence will have probabilities
deviating from the true $(m, m)$-distribution probabilities
by essentially $1/\sqrt{q}$ at most. $\biglp$Cf. (25).$\bigrp$
The full hypothesis of
$∞$-distribution is necessary for the proof of the theorem.
As a result of Theorem C, we can prove that an $∞$-distributed
sequence passes the serial test, the maximum-of-$t$ test, the collision test,
and the tests on subsequences mentioned in Section 3.3.2. It
is not hard to show that the gap test, the poker test, and the
run test are also satisfied (see exercises 12 through 14); the
coupon collector's test is considerably more difficult to deal
with, but it too is satisfied (see exercises 15 and 16).
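A gap-test sketch shows the kind of check involved: taking $[u, v) = [\,0, {1\over 2})$, so that $p = {1\over 2}$, the gap between successive visits to $[u, v)$ should have length $r$ with probability $p(1 - p)↑r$. (The built-in generator and seed below are our stand-ins for a random sequence.)

```python
import random

# Gap test with [u, v) = [0, 1/2), so p = 1/2: a gap of length r between
# successive visits should occur with probability p*(1-p)^r.
rng = random.Random(42)
n = 200000
U = [rng.random() for _ in range(n)]

gaps = []
gap = 0
for x in U:
    if x < 0.5:          # a visit to [0, 1/2)
        gaps.append(gap)
        gap = 0
    else:
        gap += 1

frac0 = gaps.count(0) / len(gaps)   # expect about 1/2
frac1 = gaps.count(1) / len(gaps)   # expect about 1/4
```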
\yskip The existence of $∞$-distributed sequences
of a rather simple type is guaranteed by the next theorem.
\thbegin Theorem F {\rm(J. N. Franklin)}. {\sl The
$[\,0,1)$ sequence $U↓0$, $U↓1$, $\ldotss$, with
$$U↓n\, =\, \theta ↑n \mod 1\eqno (26)$$
is $∞$-distributed for almost all real numbers $\theta
> 1$. That is, the set
$$\leftset \theta \relv \theta > 1\hjust{\rm\ and (26) is not $∞$-distributed}
\rightset$$
is of measure zero.}
\yyskip\noindent The proofs of this theorem and some generalizations
are given in Franklin's paper cited below.\quad\blackslug
\yyskip Franklin has shown that $\theta$ must be
a transcendental number for (26) to be $∞$-distributed. The powers
$(π↑n \mod 1)$ have been laboriously computed for $n ≤ 10000$,
using multiple-precision arithmetic, and the most significant
35 bits of each of these numbers, stored on a disk file, have
successfully been used as a source of uniform deviates. According
to Theorem F, the probability
that the powers $(π↑n \mod 1)$ are $∞$-distributed is equal to
1; yet because there are uncountably many real numbers, this
gives us no information as to whether the sequence is really
$∞$-distributed or not. It is a fairly safe bet that nobody in
our lifetimes will ever {\sl prove} that this particular sequence
is {\sl not} $∞$-distributed; but it might not be. Because of
these considerations, one may legitimately wonder if there is
any {\sl explicit} sequence that is $∞$-distributed; i.e., {\sl
is there an algorithm to compute real numbers $U↓n$ for all $n
≥ 0$, such that the sequence $\langle U↓n\rangle$ is $∞$-distributed?}
The answer is yes, as shown for example by D. E. Knuth in {\sl
BIT \bf 5} (1965), 246--250. The sequence constructed there
consists entirely of rational numbers; in fact, each number
$U↓n$ has a terminating representation in the binary number
system. Another construction of an explicit $∞$-distributed sequence,
somewhat more complicated than the sequence just cited, follows
from Theorem W below. See also N. M. Korobov, {\sl Izv.\ Akad.\
Nauk SSSR \bf 20} (1956), 649--660.
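The sequence (26) illustrates why multiple-precision arithmetic was needed for the powers of $π$: in ordinary floating point the fractional part of $\theta↑n$ is lost entirely once $\theta↑n$ exceeds $2↑{53}$. Exact rational arithmetic shows the idea; here $\theta = 3/2$ is our illustrative choice (a rational $\theta$ cannot, of course, yield an $∞$-distributed sequence):

```python
from fractions import Fraction

# U_n = theta^n mod 1 computed exactly.  Naive floating point fails for
# large n because theta^n outgrows the 53-bit significand of a double;
# exact rationals sidestep the problem.  theta = 3/2 is an illustration.
theta = Fraction(3, 2)

def U(n):
    return theta ** n % 1    # Fraction supports ** and % exactly

seq = [float(U(n)) for n in range(1, 5)]   # 0.5, 0.25, 0.375, 0.0625
```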
\yyskip\noindent{\bf C. Does $∞$-distributed $=$ random?}\xskip
In view of all the above theory about $∞$-distributed
sequences, we can be sure of one thing: the concept of an $∞$-distributed
sequence is an important one in mathematics. There is also a
good deal of evidence that the following statement is a valid
formulation of the intuitive idea of randomness:
\thbegin Definition R1. {\sl A $[\,0,1)$ sequence is defined to
be ``random'' if it is an $∞$-distributed sequence.}
\yyskip\noindent We have seen that sequences meeting this definition
will satisfy all the statistical tests of Section 3.3.2 and
many more.

Let us attempt to criticize this definition objectively. First
of all, is every ``truly random'' sequence $∞$-distributed? There
are uncountably many sequences $U↓0$, $U↓1$, $\ldots$ of real numbers
between zero and one. If a truly random number generator is
sampled to give values $U↓0$, $U↓1$, $\ldotss $, any of the possible
sequences may be considered equally likely, and some of the
sequences (indeed, uncountably many of them) are not even equidistributed.
On the other hand, using any reasonable definition of probability
on this space of all possible sequences leads us to conclude
that a random sequence is $∞$-distributed {\sl with probability
one.} We are therefore led to formalize Franklin's definition
of randomness (as given at the beginning of this section) in
the following way:
\thbegin Definition R2. {\sl A $[\,0,1)$ sequence $\langle U↓n\rangle$
is defined to be ``random'' if, whenever $P$ is a property such
that $P(\langle V↓n\rangle )$ holds with probability one for a
sequence $\langle V↓n\rangle$ of independent samples of random
variables from the uniform distribution, then $P(\langle U↓n\rangle
)$ is true.}
\yyskip Is it perhaps possible that Definition R1 is equivalent to Definition
R2? Let us try out some possible objections to Definition R1,
and see whether these criticisms are valid.
In the first place, Definition R1 deals only with limiting properties
of the sequence as $n → ∞$. There are $∞$-distributed sequences
in which the first million elements are all zero; should such
a sequence be considered random?
This objection is not very valid. If $ε$ is any positive number,
there is no reason why the first million elements of a sequence
should not all be less than $\epsilon $; as pointed out earlier
in this section, there is no legitimate way to say if a finite
sequence is random or not. With probability one, a truly random
sequence contains infinitely many runs of a million consecutive
elements less than $\epsilon $, so why can't this happen at
the beginning of the sequence?
On the other hand, consider Definition R2 and let $P$ be the property
that all elements of the sequence are distinct; $P$ is true with
probability one, so any sequence with a million zeros is not random
by {\sl this} criterion.
%folio 204 galley 4 (C) Addison-Wesley 1978 *
Now let $P$ be the property that {\sl no} element of the
sequence is equal to zero; again, $P$ is true with probability
one, so by Definition R2 any sequence with a zero element is
nonrandom. More generally, however, let $x↓0$ be any fixed
number between zero and one, and let $P$ be the property that
no element of the sequence is equal to $x↓0$; Definition R2
now says that no random sequence may contain the element $x↓0$!
We can now prove that {\sl no sequence satisfies the condition
of Definition R2.} (For if $U↓0$, $U↓1$, $\ldots$ is such a sequence,
take $x↓0 = U↓0$.)
Therefore if R1 is too weak a definition, R2 is certainly too
strong. The ``right'' definition must be less strict than R2.
We have not really shown that R1 is too weak, however, so let
us continue the attack. As mentioned above, an $∞$-distributed
sequence of {\sl rational} numbers has been constructed. (Indeed,
this is not so surprising: see exercise 18.) Almost all real
numbers are irrational; perhaps we should insist that
$$\Pr(U↓n\hjust{ is rational}) = 0$$
for a random sequence.
Note that the definition of equidistribution says that $\Pr(u
≤ U↓n < v) = v - u$. There is an obvious way to generalize this
definition, using measure theory: ``If $S \subset [\,0,1)$ is
a set of measure $\mu $, then
$$\Pr(U↓n \in S) = \mu ,\eqno (27)$$
for all random sequences $\langle U↓n\rangle $.''
In particular, if $S$ is the set of rationals, it has measure
zero, so no sequence of rational numbers is equidistributed
in this generalized sense. It is reasonable to expect that Theorem
B could be extended to Lebesgue integration instead of Riemann
integration, if property (27) is stipulated. However, once again
we find that definition (27) is too strict, for {\sl no} sequence
satisfies that property. If $U↓0$, $U↓1$, $\ldots$ is any sequence,
the set $S = \{U↓0, U↓1, \ldotss\}$ is of measure zero, yet $\Pr(U↓n
\in S) = 1$. Thus, by the force of the same argument we used to
exclude rationals from random sequences, we can exclude all
random sequences.
So far Definition R1 has proved to be defensible. There are,
however, some quite valid objections to it. For example, if
we have a random sequence in the intuitive sense, the infinite
subsequence
$$U↓0,\ U↓1,\ U↓4,\ U↓9,\ \ldotss,\ U↓{n↑2},\ \ldots\eqno (28)$$
should also be a random sequence. This is not always
true for an $∞$-distributed sequence. In fact, if we take any
$∞$-distributed sequence and set $U↓{n↑2} ← 0$ for all $n$,
the counts $\nu ↓k(n)$ which appear in the test of $k$-distributivity
are changed by at most $\sqrt{n}$, so the limits of the ratios
$\nu ↓k(n)/n$ remain unchanged. Therefore R1 definitely fails
to satisfy this randomness criterion.
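The counting argument can be checked numerically. The Python sketch below (an illustration only; the golden-ratio sequence $\langle n\phi \mod 1\rangle$ stands in for any equidistributed sequence) zeroes the square-indexed terms and confirms that the count $\nu↓1(n)$ for the interval $[\,0,{1\over2})$ moves by at most about $\sqrt{n}$:

```python
import math

# Illustrative check: zeroing U_{n^2} changes the count nu_1(N) for
# [0, 1/2) by at most the number of squares below N, about sqrt(N).
phi = (1 + math.sqrt(5)) / 2
N = 10_000
u = [(k * phi) % 1 for k in range(N)]   # an equidistributed sequence
v = list(u)
for k in range(math.isqrt(N - 1) + 1):
    v[k * k] = 0.0                      # set U_{n^2} <- 0

count_u = sum(1 for x in u if x < 0.5)  # nu_1(N) for the original
count_v = sum(1 for x in v if x < 0.5)  # nu_1(N) after the change
print(count_u, count_v, math.isqrt(N))
```

Since at most $\lfloor\sqrt{N}\rfloor + 1$ terms were altered, the two ratios $\nu↓1(N)/N$ have the same limit.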
Perhaps we should strengthen R1 as follows:
\thbegin Definition R3. {\sl A $[\,0,1)$ sequence is said to be
``random'' if each of its infinite subsequences is $∞$-distributed.}
\yyskip\noindent Once again, however, the definition turns out to be too
strict. If $\langle U↓n\rangle$ is any equidistributed sequence,
it has a monotonic subsequence with $U↓{s↓0}< U↓{s↓1}
< U↓{s↓2} < \cdotss$.
\penalty-300 % good place to break a page
The secret is to restrict the subsequences so that they could
be defined by a man who does not look at $U↓n$ before he decides
whether or not it is to be in the subsequence. The following
definition now suggests itself:
\thbegin Definition R4. {\sl A $[\,0,1)$ sequence $\langle
U↓n\rangle$ is said to be ``random'' if, for every effective
algorithm that specifies an infinite sequence of distinct nonnegative
integers $s↓n$ for $n ≥ 0$, the subsequence $U↓{s↓0}$, $U↓{s↓1}$,
$\ldots$ corresponding to this algorithm is $∞$-distributed.}
\yyskip The algorithms referred to in this definition
are effective procedures that compute $s↓n$, given $n$. (See
Section 1.1.) Thus, for example, the sequence $\langle π↑n
\mod 1\rangle$ will {\sl not} satisfy R4, since it is either not
equidistributed or there is an effective algorithm that determines
an infinite subsequence $s↓n$ with $(π↑{s↓0}\mod 1) <
(π↑{s↓1}\mod 1) < \cdotss$. Similarly, {\sl no explicitly
defined sequence can satisfy Definition R4\/}; this is appropriate,
if we agree that no explicitly defined sequence can really be
random. It is quite likely, however, that the sequence $\langle\theta
↑n \mod 1\rangle$ will satisfy Definition R4, for almost all
real numbers $\theta > 1$; this is no contradiction, since almost
all $\theta$ are uncomputable by algorithms. The following facts
are known, for example:\xskip (i) The sequence $\langle \theta ↑n
\mod 1\rangle$ satisfies Definition R4 for almost all real $\theta
> 1$, if ``$∞$-distributed'' is replaced by ``1-distributed.''
This theorem was proved by J. F. Koksma, {\sl Compositio Mathematica}
{\bf 2} (1935), 250--258.\xskip (ii) The particular sequence $\langle
\theta ↑{s(n)} \mod 1\rangle$ is $∞$-distributed for almost all
real $\theta > 1$, if $\langle s(n)\rangle$ is a sequence of
integers for which $s(n + 1) - s(n) → ∞$ as $n → ∞$. For example,
we could have $s(n) = n↑2$, or $s(n) = \lfloor n \lg n\rfloor$.
Definition R4 is much stronger than Definition R1; but it is
still reasonable to claim that Definition R4 is too weak. For
example, let $\langle U↓n\rangle$ be a truly random sequence,
and define the subsequence $\langle U↓{s↓n}\rangle$ by
the following rules: $s↓0 = 0$, and (for $n > 0$) $s↓n$ is the
smallest integer $≥n$ for which $U↓{s↓n-1}$, $U↓{s↓n
-2}$, $\ldotss$, $U↓{s↓n-n}$ are all less than ${1\over
2}$. Thus we are considering the subsequence of values following
the first consecutive run of $n$ values less than ${1\over 2}$.
Suppose that ``$U↓n < {1\over 2}$'' corresponds to the value
``heads'' in the flipping of a coin. Gamblers tend to feel that
a long run of ``heads'' makes the opposite condition, ``tails,''
more probable, assuming that a true coin is being used; and the subsequence
$\langle U↓{s↓n}\rangle$ just defined corresponds to a
gambling system for a man who places his $n$th bet on the coin
toss following the first run of $n$ consecutive ``heads.'' The
gambler may think that $\Pr(U↓{s↓n} ≥ {1\over 2})$ is more
than ${1\over 2}$, but of course in a truly random sequence
$\langle U↓{s↓n}\rangle$ will be completely random. No
gambling system will ever be able to beat the odds! Definition
R4 says nothing about subsequences formed according to such
a gambling system, so apparently we need something more.
Let us define a ``subsequence rule'' $\Rscr$ as an infinite sequence
of functions $\langle f↓n(x↓1, \ldotss , x↓n)\rangle$ where,
for $n ≥ 0$, $f↓n$ is a function of $n$ variables, and the value
of $f↓n(x↓1, \ldotss , x↓n)$ is either 0 or 1. Here $x↓1$, $\ldotss
$, $x↓n$ are elements of some set $S$. (Thus, in particular, $f↓0$
is a constant function, either 0 or 1.) A subsequence rule $\Rscr$
defines a subsequence of any infinite sequence $\langle X↓n\rangle$
of elements of $S$ as follows: {\sl The $n$th term $X↓n$ is
in the subsequence $\langle X↓n\rangle \Rscr$ if and only if $f↓n(X↓0,
X↓1, \ldotss , X↓{n-1}) = 1$.} Note that the subsequence $\langle
X↓n\rangle \Rscr$ thus defined is not necessarily infinite, and
it may in fact be the null subsequence.
For example, the ``gambler's subsequence'' just described corresponds
to the following subsequence rule: ``$f↓0 = 1$; and for $n >
0$, $f↓n(x↓1, \ldotss , x↓n) = 1$ if and only if there is some
$k$, $0 < k ≤ n$, such that $x↓m < {1\over 2}$, $x↓{m-1} < {1\over
2}$, $\ldotss$, $x↓{m-k+1} < {1\over 2}$, when $m = n$ but not when
$k ≤ m < n$.''
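As a concrete illustration, here is a Python sketch of this gambling system (the sample values at the end are invented): for each $n$ it finds the first run of $n$ consecutive values less than ${1\over2}$ and selects the term that follows it.

```python
def gambler_picks(U, max_n):
    """The gambler's subsequence U_{s_0}, U_{s_1}, ...: s_0 = 0, and s_n
    is the smallest s >= n such that U_{s-1}, ..., U_{s-n} are all < 1/2,
    i.e., the bet placed just after the first run of n "heads"."""
    picks = [U[0]]                                   # s_0 = 0
    for n in range(1, max_n + 1):
        for s in range(n, len(U)):
            if all(U[s - j] < 0.5 for j in range(1, n + 1)):
                picks.append(U[s])
                break
        else:
            break                                    # no run of length n
    return picks

# Invented sample values: runs of length 1, 2, and 3 end at indices
# 0, 3, and 7, so the gambler bets on U_1, U_4, and U_8.
print(gambler_picks([0.1, 0.6, 0.2, 0.3, 0.7, 0.4, 0.45, 0.48, 0.9], 5))
```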
A subsequence rule $\Rscr$ is said to be {\sl computable} if there
is an effective algorithm that determines the value of $f↓n(x↓1,
\ldotss , x↓n)$, when $n$ and $x↓1$, $\ldotss$, $x↓n$ are given as input.
In a definition of randomness, we should restrict ourselves
to computable subsequence rules, or else we obtain a definition
like R3 above, which is too strong. But effective algorithms cannot
deal nicely with arbitrary real numbers as inputs; for example,
if a real number $x$ is specified by an infinite radix-10 expansion,
there is no algorithm to determine if $x$ is $<{1\over 3}$ or
not, since all digits of the number $0.333\,\ldots$ have to
be examined. Therefore computable subsequence rules do not apply
to all $[\,0,1)$ sequences, and it is convenient to base our next
definition on $b$-ary sequences:
\thbegin Definition R5. {\sl A $b$-ary sequence is said to be
``random'' if every infinite sub\-sequence defined by a computable
subsequence rule is $1$-distributed.
\yskip A $[\,0,1)$ sequence $\langle U↓n\rangle$ is said to be ``random''
if the $b$-ary sequence $\langle \lfloor bU↓n\rfloor \rangle$ is
``random'' for all integers $b ≥ 2$.}
%folio 206 galley 5 (C) Addison-Wesley 1978 *
\yyskip Note that Definition R5 says only ``1-distributed,''
not ``$∞$-distributed.'' It is interesting to verify that this may
be done without loss of generality. For we may define an obviously
computable subsequence rule $\Rscr(a↓1 \ldotsm a↓k)$ as follows,
given any $b$-ary number $a↓1 \ldotsm a↓k$: Let $f↓n(x↓1, \ldotss
, x↓n) = 1$ if and only if $n ≥ k$ and $x↓{n-k+1} = a↓1$,
$\ldotss$, $x↓{n-1} = a↓{k-1}$, $x↓n = a↓k$. Now if $\langle X↓n\rangle$
is a $k$-distributed $b$-ary sequence, this rule $\Rscr(a↓1 \ldotsm
a↓k)$---which selects the subsequence consisting of those terms
just following an occurrence of $a↓1 \ldotsm a↓k$---defines an
infinite subsequence; and if this subsequence is 1-distributed,
each of the $(k + 1)$-tuples $a↓1 \ldotsm a↓ka↓{k+1}$ for $0 ≤
a↓{k+1} < b$ occurs with probability $1/b↑{k+1}$ in $\langle
X↓n\rangle $. Thus we can prove that a sequence satisfying
Definition R5 is $k$-distributed for all $k$, by induction on
$k$. Similarly, by considering the ``composition'' of subsequence
rules---if $\Rscr↓1$ defines an infinite subsequence $\langle X↓n\rangle
\Rscr↓1$, then we can define $\Rscr↓1\Rscr↓2$ to be the subsequence
rule for which $\langle X↓n\rangle \Rscr↓1\Rscr↓2 = (\langle X↓n\rangle
\Rscr↓1)\Rscr↓2$---we find that all subsequences considered in Definition
R5 are $∞$-distributed. (See exercise 32.)
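The rule $\Rscr(a↓1 \ldotsm a↓k)$ and the composition of rules are easy to program. In the Python sketch below (an illustration, not part of the construction), a rule is a predicate on the prefix $X↓0$, $\ldotss$, $X↓{n-1}$, and two rules are composed by applying the second to the subsequence selected by the first.

```python
def rule_after(pattern):
    """The computable subsequence rule R(a_1...a_k): select X_n exactly
    when the k preceding terms X_{n-k}, ..., X_{n-1} spell the pattern."""
    k = len(pattern)
    return lambda prefix: len(prefix) >= k and prefix[-k:] == pattern

def apply_rule(f, X):
    """The subsequence <X_n>R determined by the rule f."""
    return [X[n] for n in range(len(X)) if f(X[:n])]

X = list('01101011010')
R1 = rule_after(list('01'))     # terms just following an occurrence of 01
R2 = rule_after(list('1'))      # terms just following a 1
sub = apply_rule(R1, X)
# Composition R1 R2: apply R2 to the subsequence <X_n>R1.
print(sub, apply_rule(R2, sub))
```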
The fact that $∞$-distribution comes out of Definition R5 as a
very special case is encouraging, and it is a good indication
that we may at last have found the definition of randomness
we have been seeking. But alas, there still is a problem.
It is not clear that sequences satisfying Definition R5 must
satisfy Definition R4. The ``computable subsequence rules'' we
have just specified always enumerate subsequences $\langle X↓{s↓n}
\rangle$ for which $s↓0 < s↓1 < \cdotss$, but $\langle s↓n\rangle$
does not have to be monotone in R4; it must only satisfy the
condition $s↓n ≠ s↓m$ for $n ≠ m$.
To meet this objection, we may combine Definitions R4 and R5
as follows:
\thbegin Definition R6. {\sl A $b$-ary sequence $\langle X↓n\rangle$
is said to be ``random'' if, for every effective algorithm that
specifies an infinite sequence of distinct nonnegative integers
$\langle s↓n\rangle$ as a function of $n$ and the values of $X↓{s↓0}$,
$\ldotss$, $X↓{s↓{n-1}}$, the subsequence
$\langle X↓{s↓n}\rangle$ corresponding to this algorithm
is ``random'' in the sense of Definition R5.
\yskip A $[\,0,1)$ sequence $\langle U↓n\rangle$ is said to be ``random''
if the $b$-ary sequence $\langle \lfloor bU↓n\rfloor \rangle$ is
``random'' for all integers $b ≥ 2$.}
\yyskip The author contends\footnote*{At least, he made such
a contention when originally preparing the material for this section in 1966.}
that this definition
surely meets all reasonable phil\-o\-soph\-i\-cal requirements for randomness,
so it provides an answer to the principal question posed
in this section.
\subsectionbegin{D. Existence of random sequences}
We have seen that Definition R3 is too strong, in the sense
that no sequences could exist satisfying that definition; and
the formulation of Definitions R4, R5, and R6 above was carried
out in an attempt to recapture the essential characteristics
of Definition R3. In order to show that Definition R6 is not
overly restrictive, it is still necessary for us to prove that
sequences satisfying all these conditions exist. Intuitively,
we feel quite sure that such sequences exist, because we believe
that a truly random sequence exists and satisfies Definition
R6; but a proof is really necessary to show that the definition
is consistent.
An interesting method for constructing sequences satisfying
Definition R5 has been given by A. Wald. The construction starts
with a very simple 1-distributed sequence:
\thbegin Lemma T. {\sl Let the sequence of real numbers
$\langle V↓n\rangle$ be defined in terms of the binary system
as follows:
$$\eqalignno{⊗\hfill V↓0=0,\qquad V↓1=.1,\qquad V↓2=.01,\qquad V↓3=.11,\qquad
V↓4=.001,\qquad\ldots\cr\noalign{\vskip 3pt}
⊗\hfill V↓n=.c↓r\ldotsm c↓11\qquad\hjust{if }n=2↑r+c↓12↑{r-1}+\cdots+c↓r.⊗(29)\cr}$$
Let $I↓{b↓1\ldotsm b↓r}$ denote the set of all
real numbers in $[\,0,1)$ whose binary representation begins with
$0.b↓1 \ldotsm b↓r$; thus
$$I↓{b↓1\ldotsm b↓r} = [(0.b↓1 \ldotsm b↓r)↓2,\, (0.b↓1 \ldotsm b↓r)↓2 +
2↑{-r}).\eqno(30)$$
Then if $\nu (n)$ denotes the number of $V↓k$ in $I↓{b↓1\ldotsm b↓r}$
for $0 ≤ k < n$, we have}
$$|\nu (n)/n \,-\, 2↑{-r}| ≤ 1/n.\eqno (31)$$
\proofbegin Since $\nu (n)$ is the number of
$k$ for which $k \mod 2↑r = (b↓r \ldotsm b↓1)↓2$, we have $\nu (n)
= t$ or $t + 1$ when $\lfloor n/2↑r\rfloor = t$. Hence $|\nu
(n) - n/2↑r| ≤ 1$.\quad\blackslug
\yyskip It follows from (31) that the sequence
$\langle \lfloor 2↑rV↓n\rfloor \rangle$ is an equidistributed
$2↑r$-ary sequence; hence by Theorem A, $\langle V↓n\rangle$
is an equidistributed $[\,0,1)$ sequence. Indeed, it is pretty
clear that $\langle V↓n\rangle$ is about as equidistributed
as a $[\,0,1)$ sequence can be. (For further discussion of this
and related sequences, see J. C. van der Corput, {\sl Proc.\
Koninklijke Nederl.\ Akad.\ Wetenschappen \bf 38} (1935),
813--821, 1058--1066; J. H. Halton, {\sl Numerische Math.\
\bf 2} (1960), 84--90, 196; L. H. Ramshaw, to appear.)
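In programming terms, $V↓n$ is obtained by reversing the binary digits of $n$ and reading them as a binary fraction. The following Python sketch computes $V↓n$ exactly with rational arithmetic and spot-checks the bound (31) for one dyadic interval (the choices $n = 1000$ and $I↓{101}$ are illustrative):

```python
from fractions import Fraction

def corput(n):
    """V_n of Lemma T: the binary digits of n, reversed and read as a
    binary fraction; V_0 = 0, V_1 = .1, V_2 = .01, V_3 = .11, ..."""
    if n == 0:
        return Fraction(0)
    bits = bin(n)[2:]                        # n = (1 c_1 ... c_r)_2
    return Fraction(int(bits[::-1], 2), 2 ** len(bits))

# Spot-check of (31): nu(n) counts the V_k, 0 <= k < n, lying in the
# dyadic interval I_{101} = [5/8, 6/8); then |nu(n)/n - 2^{-r}| <= 1/n.
n, r = 1000, 3
lo = Fraction(5, 8)
nu = sum(1 for k in range(n) if lo <= corput(k) < lo + Fraction(1, 2 ** r))
print(nu, abs(Fraction(nu, n) - Fraction(1, 2 ** r)) <= Fraction(1, n))
```

As in the proof of Lemma T, the selected $k$ turn out to be exactly those with $k \mod 8 = (101)↓2 = 5$.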
Now let $\Rscr↓1$, $\Rscr↓2$, $\ldots$ be infinitely many subsequence
rules; we would like to find a sequence $\langle U↓n\rangle$
for which all the infinite subsequences $\langle U↓n\rangle
\Rscr↓j$ are equidistributed.
\algbegin Algorithm W (Wald sequence).
Given an infinite sequence of subsequence rules $\Rscr↓1$, $\Rscr↓2$, $\ldotss$,
which define subsequences of $[\,0,1)$ sequences of rational
numbers, this procedure defines a $[\,0,1)$ sequence $\langle U↓n\rangle
$. The computation involves infinitely many auxiliary variables
$C[a↓1, \ldotss , a↓r]$ where $r ≥ 1$ and where $a↓j = 0$ or
1 for $1 ≤ j ≤ r$. These variables are initially all zero.
\algstep W1. [Initialize $n$.] Set $n ← 0$.
\algstep W2. [Initialize $r$.] Set $r ← 1$.
\algstep W3. [Test $\Rscr↓r$.] If the element $U↓n$ is to be in
the subsequence defined by $\Rscr↓r$, based on the values of $U↓k$
for $0 ≤ k < n$, set $a↓r ← 1$; otherwise set $a↓r ← 0$.
\algstep W4. [$C{[a↓1, \ldotss , a↓r]}$ full?] If $C[a↓1, \ldotss
, a↓r] < 3 \cdot 4↑{r-1}$, go to W6.
\algstep W5. [Increase $r$.] Set $r ← r + 1$ and return
to W3.
\algstep W6. [Set $U↓n$.] Increase $C[a↓1, \ldotss , a↓r]$ by
1 and let $k$ be the new value of $C[a↓1, \ldotss , a↓r]$. Set
$U↓n ← V↓k$, where $V↓k$ is defined in Lemma T above.
\algstep W7. [Advance $n$.] Increase $n$ by 1 and return
to W2.\quad\blackslug
\yyskip Strictly speaking, this is not an
algorithm, since it doesn't terminate; it would of course be
easy to modify the procedure to stop when $n$ reaches a given
value. The reader will find it easier to grasp the idea of the
construction if he tries it by hand, replacing the number $3
\cdot 4↑{r-1}$ of step W4 by $2↑r$ during this experiment.
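Such an experiment is also easy to set up by machine. The Python sketch below transcribes steps W1--W7 for a finite horizon; the infinitely many rules $\Rscr↓r$ are represented by a single function {\tt rule(r, U)}, and the two sample rules shown are invented for illustration.

```python
from fractions import Fraction

def corput(k):
    """V_k of Lemma T (binary digits of k, reversed)."""
    if k == 0:
        return Fraction(0)
    b = bin(k)[2:]
    return Fraction(int(b[::-1], 2), 2 ** len(b))

def algorithm_w(rule, N):
    """First N terms of Algorithm W.  rule(r, U) answers the test of
    step W3: is the next element to belong to the subsequence defined
    by R_r, given the prefix U = U_0, ..., U_{n-1}?  (A sketch; the
    real algorithm runs forever, over infinitely many rules.)"""
    C = {}                                   # counters C[a_1, ..., a_r]
    U = []
    for n in range(N):                       # steps W1, W7
        a = ()
        r = 0
        while True:                          # step W2
            r += 1
            a += (1 if rule(r, U) else 0,)   # step W3
            if C.get(a, 0) < 3 * 4 ** (r - 1):
                break                        # step W4 -> W6
            # step W5: increase r and test the next rule
        C[a] = C.get(a, 0) + 1               # step W6
        U.append(corput(C[a]))
    return U

# Invented rules: R_1 selects everything; R_r for r >= 2 selects the
# terms whose immediate predecessor is < 1/2.
def rule(r, U):
    return r == 1 or (len(U) > 0 and U[-1] < Fraction(1, 2))

print(algorithm_w(rule, 8))
```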
Algorithm W is not meant to be a practical source of random
numbers; it is intended to serve only a theoretical purpose:
\thbegin Theorem W. {\sl Let $\langle U↓n\rangle$ be the sequence of rational
numbers defined by Algorithm W, and let $k$ be a positive integer.
If the subsequence $\langle U↓n\rangle \Rscr↓k$ is infinite, it
is $1$-distributed.}
\proofbegin Let $A[a↓1, \ldotss , a↓r]$ denote
the (possibly empty) subsequence of $\langle U↓n\rangle$ containing
precisely those elements $U↓n$ which, for all $j ≤ r$,
belong to subsequence $\langle U↓n\rangle \Rscr↓j$ if $a↓j = 1$
and do not belong to subsequence $\langle U↓n\rangle \Rscr↓j$
if $a↓j = 0$.
It suffices to prove, for all $r ≥ 1$ and all pairs of binary
numbers $a↓1 \ldotsm a↓r$ and $b↓1 \ldotsm b↓r$, that $\Pr(U↓n
\in I↓{b↓1\ldotsm b↓r})= 2↑{-r}$ with respect to the
subsequence $A[a↓1, \ldotss , a↓r]$, whenever the latter is infinite.
$\biglp$See Eq.\ (30).$\bigrp$ For if $r ≥ k$, the infinite sequence $\langle
U↓n\rangle \Rscr↓k$ is the finite union of the disjoint subsequences
$A[a↓1, \ldotss , a↓r]$ for $a↓k = 1$ and $a↓j = 0$ or 1 for
$1 ≤ j ≤ r$, $j ≠ k$; and it follows that $\Pr(U↓n \in I↓{b↓1\ldotsm
b↓r})= 2↑{-r}$ with respect to $\langle U↓n\rangle
\Rscr↓k$. (See exercise 33.) This is enough to show that the sequence
is 1-distributed, by Theorem A.
%folio 209 galley 6 (C) Addison-Wesley 1978 *
Let $B[a↓1, \ldotss , a↓r]$ denote the subsequence of
$\langle U↓n\rangle$ which consists of the values for those
$n$ in which $C[a↓1, \ldotss , a↓r]$ is increased by one in step
W6 of the algorithm. By the algorithm, $B[a↓1, \ldotss , a↓r]$
is a finite sequence with at most $3 \cdot 4↑{r-1}$ elements.
All but a finite number of the members of $A[a↓1, \ldotss , a↓r]$
come from the subsequences $B[a↓1, \ldotss , a↓r, \ldotss , a↓t]$,
where $a↓j = 0$ or 1 for $r < j ≤ t$.
Now assume that $A[a↓1, \ldotss , a↓r]$ is infinite, and let
$A[a↓1, \ldotss , a↓r] = \langle U↓{s↓n}\rangle$ where $s↓0
< s↓1 < s↓2 < \cdotss$. If $N$ is a large integer, with $4↑r
≤ 4↑q < N ≤ 4↑{q+1}$, it follows that the number of values of
$k < N$ for which $U↓{s↓k}$ is in $I↓{b↓1\ldotsm b↓r}$
is (except for finitely many elements at the beginning of
the subsequence)
$$\nu (N) = \nu (N↓1) + \cdots + \nu (N↓m).$$
Here $m$ is the number of subsequences $B[a↓1,
\ldotss , a↓t]$ listed above in which $U↓{s↓k}$ appears
for some $k < N$; $N↓j$ is the number of values of $k$ with $U↓{s↓k}$
in the corresponding subsequence; and $\nu (N↓j)$ is the
number of such values which are also in $I↓{b↓1\ldotsm b↓r}$.
Therefore by Lemma T,
$$\eqalign{|\nu (N) - 2↑{-r}N| ⊗ = |\nu (N↓1) - 2↑{-r}N↓1 + \cdots
+ \nu (N↓m) - 2↑{-r}N↓m| \cr\noalign{\vskip 3pt}
⊗≤ |\nu (N↓1) - 2↑{-r}N↓1|
+ \cdots + |\nu (N↓m) - 2↑{-r}N↓m|\cr\noalign{\vskip 3pt}
⊗≤ m ≤ 1 + 2 + 4 + \cdots + 2↑{q-r+1} < 2↑{q+1}.\cr}$$
The inequality on $m$ follows here from the fact that,
by our choice of $N$, $U↓{s↓N}$ is in $B[a↓1, \ldotss , a↓t]$
for some $t ≤ q + 1$.
Therefore
$$|\nu (N)/N - 2↑{-r}| ≤ 2↑{q+1}/N < 2/\sqrt{N}.\quad\blackslug$$
\yyskip To show finally that sequences satisfying
Definition R5 exist, we note first that if $\langle U↓n\rangle$
is a $[\,0,1)$ sequence of rational numbers and if $\Rscr$ is a computable
subsequence rule for a $b$-ary sequence, we can make $\Rscr$ into
a computable subsequence rule $\Rscr↑\prime$ for $\langle U↓n\rangle$
by letting $f↑\prime↓{n}(x↓1, \ldotss , x↓n)$ in $\Rscr↑\prime$
equal $f↓n(\lfloor bx↓1\rfloor , \ldotss , \lfloor bx↓n\rfloor
)$ in $\Rscr$. If the $[\,0,1)$ sequence $\langle U↓n\rangle \Rscr↑\prime$ is
equidistributed, so is the $b$-ary sequence $\langle \lfloor bU↓n\rfloor
\rangle \Rscr$. Now the set of all computable subsequence rules
for $b$-ary sequences, for all values of $b$, is countable (since
only countably many effective algorithms are possible), so they
may be listed in some sequence $\Rscr↓1$, $\Rscr↓2$, $\ldotss$; therefore
Algorithm W defines a $[\,0,1)$ sequence which is random in the
sense of Definition R5.
This brings us to a somewhat paradoxical situation. As we mentioned
earlier, no effective algorithm can define a sequence which
satisfies Definition R4, and for the same reason there is no
effective algorithm which defines a sequence satisfying Definition
R5. A proof of the existence of such random sequences is necessarily
nonconstructive; how then can Algorithm W construct such a sequence?
There is no contradiction here; we have merely stumbled on the
fact that the set of all effective algorithms cannot be enumerated
by an effective algorithm. In other words, there is no effective
algorithm to select the $j$th computable subsequence rule $\Rscr↓j$;
this happens because there is no effective algorithm to determine
if a computational method ever terminates. (We shall return
to this topic in Chapter 11.) Important large classes of algorithms
{\sl can} be systematically enumerated; thus, for example, Algorithm
W shows that it is possible to construct, with an effective algorithm,
a sequence which satisfies Definition R5 if we restrict consideration
to subsequence rules which are ``primitive recursive.''
By modifying step W6 of Algorithm W, so that it sets $U↓n ←
V↓{k+t}$ instead of $V↓k$, where $t$ is any nonnegative integer
depending on $a↓1$, $\ldotss$, $a↓r$, we can show that there are {\sl
uncountably} many $[\,0,1)$ sequences satisfying Definition R5.
\yskip The following theorem shows still another way to prove the existence
of uncountably many random sequences, using a less direct argument
based on measure theory, even if the strong definition R6 is
used:
\thbegin Theorem M. {\sl Let the real number $x$, $0 ≤ x
< 1$, correspond to the binary sequence $\langle X↓n\rangle$ if
the binary representation of $x$ is $(0.X↓0X↓1 \ldotsm)↓2$. Under this
correspondence, almost all $x$ correspond to binary sequences
that are random in the sense of Definition R6.} (In other
words, the set of all real $x$ which correspond to a binary
sequence that is nonrandom by Definition R6 has measure zero.)
\proofbegin Let $\Sscr$ be an effective algorithm
that determines an infinite sequence of distinct nonnegative
integers $\langle s↓n\rangle $, where the choice of $s↓n$ depends
only on $n$ and $X↓{s↓k}$ for $0 ≤ k < n$; and let $\Rscr$
be a computable subsequence rule. Then any binary sequence $\langle
X↓n\rangle$ leads to a subsequence $\langle X↓{s↓n}\rangle
\Rscr$, and Definition R6 says this subsequence must either be
finite or 1-distributed. It suffices to prove that {\sl for
fixed $\Rscr$ and $\Sscr$ the set $N(\Rscr, \Sscr)$ of all real $x$ corresponding
to $\langle X↓n\rangle$, such that $\langle X↓{s↓n}\rangle
\Rscr$ is infinite and not $1$-distributed, has measure zero.}
For $x$ has a nonrandom binary representation if and only if
$x$ is in $\union N(\Rscr, \Sscr)$, taken over the countably
many choices of $\Rscr$ and $\Sscr$.
Therefore let $\Rscr$ and $\Sscr$ be fixed. Consider the set $T(a↓1a↓2
\ldotsm a↓r)$ defined for all binary numbers $a↓1a↓2 \ldotsm a↓r$
as the set of all $x$ corresponding to $\langle X↓n\rangle $,
such that $\langle X↓{s↓n}\rangle \Rscr$ has $≥r$ elements
whose first $r$ elements are respectively equal to $a↓1$, $a↓2$,
$\ldotss$, $a↓r$. Our first result is that
$$T(a↓1a↓2 \ldotsm a↓r)\hjust{ has measure }≤2↑{-r}.\eqno(32)$$
To prove this, we start by observing that $T(a↓1a↓2
\ldotsm a↓r)$ is a measurable set: Each element of $T(a↓1a↓2
\ldotsm a↓r)$ is a real number $x = (0.X↓0X↓1 \ldotsm)↓2$ for which
there exists an integer $m$ such that algorithm $\Sscr$ determines
distinct values $s↓0$, $s↓1$, $\ldotss$, $s↓m$, and rule $\Rscr$ determines
a subsequence of $X↓{s↓0}$, $X↓{s↓1}$, $\ldotss$, $X↓{s↓m}$
such that $X↓{s↓m}$ is the $r$th element of this subsequence.
The set of all real $y = (0.Y↓0Y↓1\ldotsm)↓2$ such that $Y↓{s↓k}=X↓{s↓k}$
for $0 ≤ k ≤ m$ also belongs to $T(a↓1a↓2 \ldotsm a↓r)$, and
this is a measurable set consisting of the finite union of dyadic
subintervals $I↓{b↓1\ldotsm b↓t}$. Since there are
only countably many such dyadic intervals, we see that $T(a↓1a↓2
\ldotsm a↓r)$ is a countable union of dyadic intervals, and it
is therefore measurable. Furthermore, this argument can be extended
to show that the measure of $T(a↓1 \ldotsm a↓{r-1}0)$ equals
the measure of $T(a↓1 \ldotsm a↓{r-1}1)$, since the latter is
a union of dyadic intervals obtained from the former by requiring
that $Y↓{s↓k}= X↓{s↓k}$ for $0 ≤ k < m$ and $Y↓{s↓m}
≠ X↓{s↓m}$. Now since
$$T(a↓1 \ldotsm a↓{r-1}0) ∪ T(a↓1 \ldotsm a↓{r-1}1) \subset T(a↓1
\ldotsm a↓{r-1}),$$
the measure of $T(a↓1a↓2 \ldotsm a↓r)$ is at most
one-half the measure of $T(a↓1 \ldotsm a↓{r-1})$. The inequality
(32) follows by induction on $r$.
Now that (32) has been established, the remainder of the proof
is essentially to show that the binary representations of almost
all real numbers are equidistributed. The next few paragraphs
constitute a rather long but not difficult proof of this fact,
and they serve to illustrate typical estimation techniques in
mathematical analysis.
%folio 212 galley 7 (C) Addison-Wesley 1978 *
For $0 < \epsilon < 1$, let $B(r, \epsilon )$ be $\union
T(a↓1 \ldotsm a↓r)$, where the union is taken over all binary
numbers $a↓1 \ldotsm a↓r$ for which the number $\nu (r)$ of zeros
among $a↓1 \ldotsm a↓r$ satisfies
$$|\nu (r) - {1\over 2}r| ≥ 1 + \epsilon r.$$
The number of such binary numbers is $C(r, \epsilon
) = \sum {r\choose k}$ summed over the values of $k$ with $|k
- {1\over 2}r| ≥ 1 + \epsilon r$. A suitable estimate for the
tail of the binomial distribution can be obtained by the following
standard trick: Let $x$ be any positive number less than 1,
and let $s = ({1\over 2} + \epsilon )r$. Then
$$\sum ↓{k≥s}{r\choose k} ≤ \sum ↓{k≥s}{r\choose k}x↑{s-k}
≤ \sum ↓{k≥0}{r\choose k}x↑{s-k} = x↑s\left(1 + {1\over
x}\right)↑r.$$
By elementary calculus, the minimum value of $x↑s(1
+ 1/x)↑r$ occurs when $x = r/s - 1 = (1 - 2\epsilon )/(1 + 2\epsilon
)$, and this value of $x$ yields
$$\sum ↓{k≥({1\over2}+\epsilon )r}{r\choose k} ≤ \left(2\over
(1 + 2\epsilon )↑{{1\over2}+\epsilon}(1 - 2\epsilon )↑{{1\over2}-\epsilon}\right)↑r.
$$ Now when $\epsilon < {1\over 2}$ we have
$$\textstyle({1\over 2} + \epsilon )\ln(1 + 2\epsilon ) + ({1\over 2}
- \epsilon )\ln(1 - 2\epsilon ) = {1\over 1 \cdot 2} (2\epsilon
)↑2 + {1\over 3 \cdot 4} (2\epsilon )↑4 + {1\over 5 \cdot 6}
(2\epsilon )↑6 + \cdots > 2\epsilon ↑2,$$
hence the denominator in our estimate is larger
than $e↑{-2\epsilon↑2}$. We have proved that
\vskip 4pt
$$\sum ↓{k≥({1\over2}+\epsilon )r}{r\choose k} < 2↑re↑{-2ε↑2r},\qquad
\hjust{for all $r > 0$ and $0 < ε< {1\over 2}$}.\eqno (33)$$
But $C(r, \epsilon )$ is at most twice this left-hand quantity,
hence by (32) and (33)
$$B(r, \epsilon )\hjust{ has measure }≤2↑{-r}C(r,
\epsilon ) < 2e↑{-2ε↑2r}.$$
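Inequality (33) is easy to test numerically. The following Python sketch compares the exact binomial tail with the bound $2↑re↑{-2ε↑2r}$ for a few illustrative values of $r$ and $ε$:

```python
import math

def tail(r, eps):
    """Exact binomial tail: sum of C(r,k) over k >= (1/2 + eps) r."""
    s = math.ceil((0.5 + eps) * r)
    return sum(math.comb(r, k) for k in range(s, r + 1))

# Compare the tail with the right-hand side 2^r exp(-2 eps^2 r) of (33).
for r in (10, 50, 200):
    for eps in (0.05, 0.1, 0.25, 0.4):
        print(r, eps, tail(r, eps) < 2 ** r * math.exp(-2 * eps * eps * r))
```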
The next step is to define
$$B*(r, \epsilon ) = B(r, \epsilon ) ∪ B(r + 1, \epsilon
) ∪ B(r + 2, \epsilon ) ∪ \cdotss.$$
The measure of $B*(r, \epsilon )$ is at
most $\sum ↓{k≥r}2e↑{-2ε↑2k}$, and this is the
remainder of a convergent series so
$$\lim↓{r→∞}\biglp\hjust{measure of }B*(r, \epsilon
)\bigrp = 0.\eqno (34)$$
Now if $x$ is a real number whose binary
expansion $(0.X↓0X↓1\ldotsm)↓2$ leads to an infinite sequence $\langle
X↓{s↓n}\rangle\Rscr$ that is not 1-distributed, and if $\nu
(r)$ denotes the number of zeros in the first $r$ elements of
the latter sequence, then
$$|\nu (r)/r - {1\over 2}| ≥ 2\epsilon ,$$
for some $\epsilon > 0$ and infinitely many $r$.
This means $x$ is in $B*(r, \epsilon )$ for all $r$.
So finally we find that
$$N(\Rscr,\Sscr) = \union↓{t≥2}\,\inter↓{r≥1} B*(r,1/t),$$
and, by (34), $\inter↓{r≥1} B*(r,1/t)$
has measure zero for all $t$. Hence $N(\Rscr,\Sscr)$ has
measure zero.\quad\blackslug
\yyskip From the existence of {\sl binary} sequences
satisfying Definition R6, we can show the existence of $[\,0,1)$
sequences which are random in this sense. For details, see exercise
36. The consistency of Definition R6 is thereby established.
\subsectionbegin{E. Random finite sequences} An argument
was given above to indicate that it is impossible to define
the concept of randomness for finite sequences; any given finite
sequence is as likely as any other. Still, nearly everyone would
agree that the sequence 011101001 is ``more random'' than 101010101,
and even the latter sequence is ``more random'' than 000000000.
Although it is true that truly random sequences will exhibit
locally nonrandom behavior, we would expect such behavior only
in a long finite sequence, not in a short one.
Several ways for defining the randomness of a finite sequence
have been proposed, and only a few of the ideas will be sketched
here. Let us consider only the case of $b$-ary sequences.
Given a $b$-ary sequence $X↓1$, $X↓2$, $\ldotss$, $X↓N$, we can say
that
$$\Pr\biglp S(n)\bigrp \approx p,\qquad\hjust{if }|\nu (N)/N
- p| ≤ 1/\sqrt{N},$$
where $\nu (N)$ is the quantity appearing in Definition
A at the beginning of this section. The above sequence can be
called ``$k$-distributed'' if
$$\Pr(X↓nX↓{n+1} \ldotsm X↓{n+k-1} = x↓1x↓2 \ldotsm x↓k) \approx
1/b↑k$$
for all $b$-ary numbers $x↓1x↓2 \ldotsm x↓k$. $\biglp$Cf.\
Definition D; unfortunately a sequence may be $k$-distributed
by this new definition when it is not $(k - 1)$-distributed.$\bigrp$
%folio 214 galley 8 (C) Addison-Wesley 1978 *
A definition of randomness may now be given analogous
to Definition R1, as follows:
\thbegin Definition Q1. {\sl A $b$-ary sequence of length $N$ is
``random'' if it is $k$-distributed (in the above sense) for all
positive integers $k ≤ \log↓b N$.}
\yyskip According to this definition, for example,
there are 170 nonrandom binary sequences of length 11:
$$\vcenter{\halign{#\qquad⊗#\qquad⊗#\qquad⊗#\cr
00000001111⊗10000000111⊗11000000011⊗11100000001\cr
00000001110⊗10000000110⊗11000000010⊗11100000000\cr
00000001101⊗10000000101⊗11000000001⊗10100000001\cr
00000001011⊗10000000011⊗01000000011⊗01100000001\cr
00000000111\cr}}$$
plus 01010101010 and all sequences with nine or more
zeros, plus all sequences obtained from the preceding sequences by
interchanging ones and zeros.
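Definition Q1 is simple to mechanize. In the Python sketch below, occurrences of each pattern are counted over the $N - k + 1$ window positions while the denominator $N$ of the displayed formula is kept; the text leaves the boundary convention for finite sequences implicit, so this is only one plausible reading, and the final test string is an invented example.

```python
import math
from itertools import product

def is_q1_random(s):
    """Definition Q1 (sketch) for a binary string s: s must be
    k-distributed for all 1 <= k <= lg N, with |nu(N)/N - 2^{-k}|
    <= 1/sqrt(N) for every pattern of length k."""
    N = len(s)
    for k in range(1, int(math.log2(N)) + 1):
        for pat in map(''.join, product('01', repeat=k)):
            nu = sum(s[j:j + k] == pat for j in range(N - k + 1))
            if abs(nu / N - 1 / 2 ** k) > 1 / math.sqrt(N):
                return False
    return True

for s in ('00000000000', '01010101010', '00000001111', '01101001100'):
    print(s, is_q1_random(s))
```

Under this reading the all-zero sequence, 01010101010, and the listed sequence 00000001111 are all flagged as nonrandom, in agreement with the text.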
Similarly, we can formulate a definition for finite sequences
analogous to Definition R6. Let {\bf A} be a set of algorithms,
each of which is a combined selection-and-choice procedure
that gives a subsequence $\langle X↓{s↓n}\rangle\Rscr$,
as in the proof of Theorem M.
\thbegin Definition Q2. {\sl The $b$-ary sequence
$X↓1$, $X↓2$, $\ldotss$, $X↓N$ is $(n, \epsilon )$-random with respect
to a set of algorithms {\bf A}, if for every subsequence
$X↓{t↓1}$, $X↓{t↓2}$, $\ldotss$, $X↓{t↓m}$ determined by
an algorithm of {\bf A} we have either $m < n$ or
$$\left|{1\over m} \nu↓a(X↓{t↓1}, \ldotss , X↓{t↓m})
- {1\over b} \right| ≤ \epsilon \qquad\hjust{for }0 ≤ a
< b.$$
Here $\nu ↓a(x↓1, \ldotss , x↓m)$ is the number
of $a$'s in the sequence $x↓1$, $\ldotss$, $x↓m$.}
\yyskip (In other words, every sufficiently long subsequence determined
by an algorithm of {\bf A} must be approximately equidistributed.)
The basic idea in this case is to let {\bf A} be a set of ``simple''
algorithms; the number (and the complexity) of the algorithms
in {\bf A} can grow as $N$ grows.
As an example of Definition Q2, let us consider binary sequences,
and let {\bf A} be just the following four algorithms:
\yyskip\textindent{a)}Take the whole sequence.
\textindent{b)}Take alternate terms of the sequence, starting with the
first.
\textindent{c)}Take the terms of the sequence following a zero.
\textindent{d)}Take the terms of the sequence following a one.
\yyskip Now a sequence $X↓1$, $\ldotss$, $X↓8$ is $(4, {1\over
8})$-random if:
\def\\#1{\noindent\hangindent 38pt\hjust to 38pt{#1 }}
\yyskip\\{by (a),}$|{1\over 8}(X↓1 + \cdots + X↓8) - {1\over
2}| ≤ {1\over 8}$, i.e., if there are 3, 4, or 5 ones;
\yskip\\{by (b),}$|{1\over 4}(X↓1 + X↓3 + X↓5 + X↓7)
- {1\over 2}| ≤ {1\over 8}$, i.e., if exactly two ones occur in
odd-numbered positions;
\yskip\\{by (c),}there are three possibilities, depending
on how many zeros occupy positions $X↓1$, $\ldotss$, $X↓7$: if there
are 2 or 3 zeros here, there is no condition to test (since
$n = 4$); if there are 4 zeros, the four terms following them must be
exactly two zeros and two ones; and if there are 5 zeros, the five terms
following them must include either two or three zeros;
\yskip\\{by (d),}we get conditions similar
to those implied by (c).
\yyskip It turns out that only the following binary sequences of length
8 are $(4, {1\over 8})$-random with respect to these rules:
$$\vcenter{\halign{#\qquad⊗#\qquad⊗#\qquad⊗#\cr
00001011⊗00101001⊗01001110⊗01101000\cr
00011010⊗00101100⊗01011011⊗01101100\cr
00011011⊗00110010⊗01011110⊗01101101\cr
00100011⊗00110011⊗01100010⊗01110010\cr
00100110⊗00110110⊗01100011⊗01110110\cr
00100111⊗00111001⊗01100110\cr}}$$
plus those obtained by interchanging 0 and 1 consistently.
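This example of Definition Q2 can also be confirmed by exhaustive search over all 256 binary sequences of length 8, using $n = 4$ and $\epsilon = {1\over 8}$. The following Python sketch is an editorial illustration; the function names are ours.

```python
from itertools import product

N_MIN, EPS = 4, 1 / 8   # the (n, epsilon) of the example

def balanced(subseq):
    """Definition Q2 test: either the subsequence is shorter than n,
    or its fraction of ones is within epsilon of 1/2."""
    m = len(subseq)
    return m < N_MIN or abs(sum(subseq) / m - 1 / 2) <= EPS

def q2_random(x):
    """Apply the four algorithms (a)-(d) to x = (X1, ..., X8)."""
    whole = x                                              # (a)
    alternate = x[0::2]                                    # (b) X1, X3, X5, X7
    after_zero = [x[i + 1] for i in range(7) if x[i] == 0] # (c)
    after_one = [x[i + 1] for i in range(7) if x[i] == 1]  # (d)
    return all(map(balanced, (whole, alternate, after_zero, after_one)))

random8 = [x for x in product((0, 1), repeat=8) if q2_random(x)]
```

By the symmetry of the four rules under interchange of zeros and ones, the resulting set is closed under complementation, so the 23 listed sequences together with their complements give 46 sequences in all.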
It is clear that we could make the set of algorithms so large
that no sequences satisfy the definition, when $n$ and $\epsilon$
are reasonably small. A. N. Kolmogorov has proved that an $(n,
\epsilon )$-random binary sequence {\sl will} always exist,
for any given $N$, if the number of algorithms in {\bf A} does
not exceed
$${1\over 2}e↑{2n\hskip.5pt ε↑2(1-ε)}.\eqno (35)$$
This result is not nearly strong enough to show
that sequences satisfying Definition Q1 will exist, but the
latter can be constructed efficiently using the procedure of
Rees in exercise 3.2.2-21.
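To get a feeling for how restrictive bound (35) is, it can be evaluated numerically; the following Python sketch is an editorial illustration (the function name is ours). For small $\epsilon$ the bound admits very few algorithms, e.g.\ fewer than one when $n = 1000$ and $\epsilon = 0.01$, while for $\epsilon = 0.1$ it already exceeds $10↑7$.

```python
from math import exp

def kolmogorov_bound(n, eps):
    # Bound (35): an (n, eps)-random binary sequence exists whenever
    # the number of algorithms in A is at most this value.
    return 0.5 * exp(2 * n * eps**2 * (1 - eps))

print(kolmogorov_bound(1000, 0.01))   # well below 1
print(kolmogorov_bound(1000, 0.1))    # roughly 3.3e7
```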
Still another interesting approach to a definition of randomness
has been taken by Per Martin-L\"of [{\sl Information and
Control \bf 9} (1966), 602--619]. Given a finite $b$-ary sequence
$X↓1$, $\ldotss$, $X↓N$, let $l(X↓1, \ldotss , X↓N)$ be the length
of the shortest Turing machine program which generates this
sequence. (For a definition of Turing machines, see Chapter
11; alternatively, we could use certain classes of effective
algorithms, such as those defined in exercise 1.1-8.) Then
$l(X↓1, \ldotss , X↓N)$ is a measure of the ``patternlessness''
of the sequence, and we may equate this idea with randomness.
The sequences of length $N$ which maximize $l(X↓1, \ldotss
, X↓N)$ may be called random. (From the standpoint of practical
random-number generation by computer, this is, of course, the
worst definition of ``randomness'' that can be imagined!)
Essentially the same definition of randomness was independently
given by G. Chaitin at about the same time; see {\sl JACM \bf 16}
(1969), 145--159. It is interesting to note that even though
this definition makes no reference to equidistribution properties
as our other definitions have, Martin-L\"of and Chaitin
have proved that random sequences of this type also have the
expected equidistribution properties. In fact, Martin-L\"of has
demonstrated that such sequences satisfy {\sl all} computable
statistical tests for randomness, in an appropriate sense.
For further developments in the definition of random finite sequences,
cf.\ A. K. Zvonkin and L. A. Levin, {\sl Uspekhi Mat.\ Nauk \bf 25},6
(Nov. 1970), 85--127 [English translation in {\sl Russian Math.\ Surveys \bf25},6
(Nov. 1970), 83--124]; L. A. Levin, {\sl Doklady Akad.\ Nauk SSSR \bf212}
(1973), 548--550.
\subsectionbegin{F. Summary, history, and bibliography}
We have defined several degrees of randomness that a sequence
might possess.
An infinite sequence that is $∞$-distributed satisfies
a great many useful properties which are expected of random
sequences, and there is a rich theory concerning $∞$-distributed
sequences. (The exercises which follow develop several important
properties of $∞$-distributed sequences that have not been mentioned
in the text.) Definition R1 is therefore an appropriate basis
for theoretical studies of randomness.
The concept of an $∞$-distributed $b$-ary sequence was introduced
in 1909 by \'Emile Borel. He essentially defined the concept of
an $(m, k)$-distributed sequence, and showed that the $b$-ary
representations of almost all real numbers are $(m, k)$-distributed
for all $m$ and $k$. He called such numbers {\sl normal\/\hskip1pt} to
base $b$. An excellent discussion of this topic appears in his
well-known book, {\sl Le\c ons sur la th\'eorie des
fonctions} (2nd ed., 1914), 182--216.
The notion of an $∞$-distributed sequence of {\sl real} numbers,
also called a {\sl completely equidistributed sequence}, was
introduced by N. M. Korobov in {\sl Doklady Akad.\ Nauk SSSR
\bf 62} (1948), 21--22. He and several colleagues developed
the theory of such sequences quite extensively during the 1950s
in a series of papers.
Completely equidistributed sequences were independently studied by Joel N.
Franklin, {\sl Math.\ Comp.\ \bf17} (1963), 28--59, in a paper that
is particularly note\-worthy because it was inspired by the problem of
random-number generation.
The book {\sl Uniform Distribution of
Sequences} by L. Kuipers and H. Niederreiter (New York: Wiley,
1974) is an extraordinarily complete source of information about
the rich mathematical literature concerning $k$-distributed
sequences of all kinds.
We have seen, however, that $∞$-distributed sequences need not
be sufficiently haphazard to qualify completely as ``random.''
Three definitions, R4, R5, and R6, were formulated above to
provide the additional conditions; and Definition R6, in particular,
seems to be an appropriate way to define the concept of an infinite
random sequence. It is a precise, quantitative statement that
may well coincide with the intuitive idea of true randomness.
%folio 216 galley 9 (C) Addison-Wesley 1978 *
Historically, the development of these definitions was primarily
influenced by R. von Mises' quest for a good definition of ``probability.''
In {\sl Math.\ Zeitschrift \bf 5} (1919), 52--99, von Mises proposed
a definition similar in spirit to Definition R5, although stated
too strongly (like our Definition R3) so that no sequences satisfying
the conditions could possibly exist. Many people noticed this
discrepancy, and A. H. Copeland [{\sl Amer.\ J. Math.\ \bf 50}
(1928), 535--552] suggested weakening von Mises' definition
by substituting what he called ``admissible numbers'' (or Bernoulli
sequences). These are equivalent to $∞$-distributed $[\,0,1)$ sequences
in which all entries $U↓n$ have been replaced by 1 if $U↓n <
p$ or by 0 if $U↓n ≥ p$, for a given probability $p$. Thus Copeland
was essentially suggesting a return to Definition R1. Then Abraham
Wald showed that it is not necessary to weaken von Mises' definition
so drastically, and he proposed substituting a countable set
of subsequence rules. In an important paper [{\sl Ergebnisse
eines math.\ Kolloquiums \bf 8} (Vienna, 1937), 38--72], Wald
essentially proved Theorem W, although he made the erroneous
assertion that the sequence constructed by Algorithm W also
satisfies the stronger condition that $\Pr(U↓n \in A) =\hjust{measure
of }A$, for all Lebesgue measurable $A \subset [\,0,1)$. We have
observed that no sequence can satisfy this property.
\def\\{{\sl Zeitschr.\ f.\ math.\ Logik und Grundlagen d.\ Math.\ }}
The concept of ``computability'' was still very much in its
infancy when Wald wrote his paper, and A. Church [{\sl Bull.\ Amer.\ Math.\
Soc.\ \bf 47} (1940), 130--135] showed how the precise notion
of ``effective algorithm'' could be added to Wald's theory to
make his definitions completely rigorous. The extension to Definition
R6 was due essentially to A. N. Kolmogorov [{\sl Sankhy\=a}
series A, {\bf 25} (1963), 369--376], who proposed Definition
Q2 for finite sequences in that same paper.
Another definition of randomness for finite sequences,
somewhere ``between'' Definitions Q1 and Q2,
had been formulated many years earlier by A. S. Besicovitch
[{\sl Math.\ Zeitschrift \bf 39} (1934), 146--156].
The publications of Church and Kolmogorov considered only binary
sequences for which $\Pr(X↓n = 1) = p$ for a given probability
$p$. Our discussion in this section has been slightly
more general, since a $[\,0,1)$ sequence essentially represents
all $p$ at once.
The von Mises--Wald--Church definition has been refined in
yet another interesting way by J. V. Howard, \\ {\bf21} (1975), 215--224.
\def\Pro{\mathop{\overline{\char'120\char'162}}}
\def\Pru{\mathop{\underline{\char'120\char'162}}}
Another important contribution was made by Donald W. Loveland
[\\ {\bf12} (1966), 279--294], who discussed Definitions R4, R5, R6, and
several intermediate concepts. Loveland proved that there are
R5-random sequences which do not satisfy R4, thereby establishing
the need for a stronger definition such as R6. In fact, he defined
a rather simple permutation $\langle f(n)\rangle$ of the nonnegative
integers, and an Algorithm W$↑\prime$ analogous to Algorithm W,
such that $\Pro(U↓{f(n)} ≥ {1\over 2}) - \Pru(U↓{f(n)}
≥ {1\over 2}) ≥ {1\over 2}$ for every R5-random sequence $\langle
U↓n\rangle$ produced by Algorithm W$↑\prime$ when it is given
an infinite set of subsequence rules $\Rscr↓k$.
Although Definition R6 is intuitively much stronger than R4, it is apparently
not a simple matter to prove this rigorously, and for several years it was an
open question whether or not R4 implies R6. Finally
Thomas Herzog and James C. Owings, Jr., discovered how to construct a large
family of sequences that satisfy R4 but not R6. [See \\ {\bf22} (1976),
385--389.]
Kolmogorov wrote another significant paper [{\sl Problemy Pereda\v ci
Informatsii \bf 1} (1965), 3--11] in which he considered the problem of
defining the ``information content'' of a sequence, and this
work led to Chaitin and Martin-L\"of's interesting definition
of finite random sequences via ``patternlessness.'' [See {\sl
IEEE Trans.\ \bf IT-14} (1968), 662--664.]
For a philosophical discussion of random sequences, see K. R. Popper,
{\sl The Logic of Scientific Discovery} (London, 1959), especially
the interesting construction on pp.\ 162--163, which he first
published in 1934.
Further connections between random sequences and recursive function theory
have been explored by D. W. Loveland, {\sl Trans.\ Amer.\ Math.\ Soc.\
\bf125} (1966), 497--510. See also C.-P. Schnorr [{\sl Zeitschr.\
Wahr.\ verw.\ Geb.\ \bf14} (1969), 27--35], who found strong relations
between random sequences and the ``species of measure zero'' defined by
L. E. J. Brouwer in 1919. Schnorr's subsequent
book {\sl Zuf\"alligkeit und Wahrscheinlichkeit} [{\sl Lecture Notes
in Math.\ \bf 218} (Berlin: Springer, 1971)] gives a
detailed treatment of the entire subject of randomness
and makes an excellent introduction to
the ever-growing advanced literature on the topic.
\exbegin{EXERCISES}
\exno 1. [10] Can a periodic sequence be equidistributed?
\exno 2. [10] Consider the periodic binary sequence 0, 0, 1,
1, 0, 0, 1, 1, $\ldotss\,$. Is it 1-distributed? Is it 2-distributed?
Is it 3-distributed?
\exno 3. [M22] Construct a periodic ternary sequence which is
3-distributed.
\trexno 4. [HM22] Let $U↓n = (2↑{\lfloor\lg(n+1)\rfloor}/3)\mod 1$.
What is $\Pr(U↓n < {1\over 2})$?
\exno 5. [HM14] Prove that $\Pr\biglp S(n)\hjust{ and }T(n)\bigrp +
\Pr\biglp S(n)\hjust{ or }T(n)\bigrp = \Pr\biglp S(n)\bigrp + \Pr\biglp
T(n)\bigrp $, for any two statements $S(n)$, $T(n)$, provided
that at least three of the limits exist. For example, if a sequence
is 2-distributed, we would find that
$$\Pr(u↓1 ≤ U↓n < v↓1\hjust{ or }u↓2 ≤ U↓{n+1}
< v↓2)= v↓1 - u↓1 + v↓2 - u↓2 - (v↓1 - u↓1)(v↓2 - u↓2).$$
\exno 6. [HM23] Let $S↓1(n)$, $S↓2(n)$, $\ldots$
be an infinite sequence of statements about mutually disjoint
events; i.e., $S↓i(n)$ and $S↓j(n)$ cannot simultaneously be
true if $i ≠ j$. Assume that $\Pr\biglp S↓j(n)\bigrp$ exists for
each $j ≥ 1$. Show that $\Pru\biglp S↓j(n)\hjust{ is true for some }j
≥ 1\bigrp ≥ \sum ↓{j≥1}\Pr\biglp S↓j(n)\bigrp $, and give
an example to show that equality need not hold.
\exno 7. [HM27] Let $\{S↓{ij}(n)\}$ be a family of statements
such that $\Pr\biglp S↓{ij}(n)\bigrp$ exists for all $i,j≥1$. Assume that for
all $n > 0$, $S↓{ij}(n)$ is true for exactly one pair of integers
$i, j$. If $\sum ↓{i,j≥1}\Pr\biglp S↓{ij}(n)\bigrp = 1$,
does it follow that ``$\Pr\biglp S↓{ij}(n)
\hjust{ is true for some }j ≥ 1\bigrp$'' exists for all $i≥1$, and that it
equals $\sum ↓{j≥1}\Pr\biglp S↓{ij}(n)\bigrp $?
\exno 8. [M15] Prove (13).
\exno 9. [HM20] Prove Lemma E.\xskip [{\sl Hint:} Consider $\sum
↓{1≤j≤m}(y↓{jn} - α)↑2$.]
\trexno 10. [HM22] Where was the fact that $q$ is a multiple of
$m$ used in the proof of Theorem C?
\exno 11. [M10] Use Theorem C to prove that if $\langle U↓n\rangle$
is $∞$-distributed, so is the subsequence $\langle U↓{2n}\rangle$.
\exno 12. [HM20] Show that a $k$-distributed sequence passes
the ``maximum of $k$'' test, in the following sense: $\Pr\biglp
u ≤ \max(U↓n, U↓{n+1}, \ldotss , U↓{n+k-1}) < v\bigrp = v↑k
- u↑k$.
\trexno 13. [HM27] Show that an $∞$-distributed $[\,0,1)$ sequence
passes the ``gap test'' in the following sense: If $ 0 ≤ α
< β ≤ 1$ and $p = β - α$, let $f(0) = 0$, and for $n ≥ 1$
let $f(n)$ be the smallest integer $m > f(n - 1)$ such that
$α ≤ U↓m < β$; then $$\Pr\biglp f(n) - f(n - 1) = k\bigrp\,=\,p(1
- p)↑{k-1}.$$
\exno 14. [HM25] Show that an $∞$-distributed sequence passes
the ``run test'' in the following sense: If $f(0) = 1$ and if,
for $n ≥ 1$, $f(n)$ is the smallest integer $m > f(n - 1)$ such
that $U↓{m-1} > U↓m$, then
$$\Pr\biglp f(n) - f(n - 1) = k\bigrp\,=\,2k/(k + 1)! - 2(k + 1)/(k
+ 2)!.$$
\trexno 15. [HM30] Show that an $∞$-distributed
sequence passes the ``coupon-collector's test'' when there are
only two kinds of coupons, in the following sense: Let $X↓1$,
$X↓2$, $\ldots$ be an $∞$-distributed binary sequence. Let $f(0) =
0$, and for $n ≥ 1$ let $f(n)$ be the smallest integer $m >
f(n - 1)$ such that $\{X↓{f(n-1)+1}, \ldotss , X↓m\}$ is the
set $\{0, 1\}$. Prove that $\Pr\biglp f(n) - f(n - 1) = k\bigrp
\,=\, 2↑{1-k}$, for $k ≥ 2$. (Cf.\ exercise 7.)
\exno 16. [HM38] Does the coupon-collector's
test hold for $∞$-distributed sequences when there are more than
two kinds of coupons? (Cf.\ the previous exercise.)
\exno 17. [HM50] If $r$ is any given rational number, Franklin
has proved that the sequence $\langle r↑n \mod 1\rangle$ is not
2-distributed. But is there any rational number $r$ for which
this sequence is equidistributed? In particular, is the sequence
equidistributed when $r = {3\over 2}$? [Cf.\ K.
Mahler, {\sl Mathematika \bf 4} (1957), 122--124.]
\trexno 18. [HM22] Prove that if $U↓0$, $U↓1$, $\ldots$ is $k$-distributed,
so is the sequence $V↓0$, $V↓1$, $\ldotss $, where $V↓n = \lfloor
nU↓n\rfloor /n$.
\exno 19. [HM35] Consider Definition R4 with ``$∞$-distributed''
replaced by ``1-distributed''. Is there a sequence which satisfies
this weaker definition, but which is not $∞$-distributed? (Is
the weaker definition really weaker?)
\exno 20. [HM47] Does the sequence $\langle\theta↑n\mod 1\rangle$ satisfy
Definition R4 for almost all real numbers $\theta>1$?
%folio 220 galley 10 (C) Addison-Wesley 1978 *
\exno 21. [HM20] Let $S$ be a set and let $\Mscr$
be a collection of subsets of $S$. Suppose that $p$ is a real-valued
function of the sets in $\Mscr$, such that $p(M)$ denotes the probability
that a ``randomly'' chosen element of $S$ lies in $M$. Generalize
Definitions B and D to obtain a good definition of the concept
of a $k$-distributed sequence $\langle Z↓n\rangle$ of elements
of $S$ with respect to the probability distribution $p$.
\trexno 22. [HM30] (Hermann Weyl.)\xskip Show that the $[\,0,1)$ sequence
$\langle U↓n\rangle$ is $k$-distributed if and only if
$$\lim↓{N→∞} {1\over N} \sum ↓{0≤n<N} \exp\biglp
2πi(c↓1U↓n + \cdots + c↓kU↓{n+k-1})\bigrp = 0$$
for every set of integers $c↓1$, $c↓2$, $\ldotss$,
$c↓k$ not all zero.
\exno 23. [M34] Show that a $b$-ary sequence $\langle X↓n\rangle$
is $k$-distributed if and only if all of the sequences $\langle
c↓1X↓n + c↓2X↓{n+1} + \cdots + c↓kX↓{n+k-1}\rangle$ are {\sl
equidistributed}, whenever $c↓1$, $c↓2$, $\ldotss$, $c↓k$ are integers
with $\gcd(c↓1, \ldotss , c↓k) = 1$.
\exno 24. [M25] Show that
a $[\,0,1)$ sequence $\langle U↓n\rangle$ is $k$-distributed if
and only if all of the sequences $\langle c↓1U↓n + c↓2U↓{n+1}
+ \cdots + c↓kU↓{n+k-1}\rangle$ are {\sl equidistributed}, whenever
$c↓1$, $c↓2$, $\ldotss$, $c↓k$ are integers not all zero.
\exno 25. [HM20] A sequence is called a ``white sequence'' if
all serial correlations are zero, i.e., if the equation in Corollary
S is true for {\sl all} $k ≥ 1$. (By Corollary S, an $∞$-distributed
sequence is white.) Show that if a $[\,0,1)$ sequence is equidistributed,
it is white if and only if
$$\lim↓{n→∞} {1\over n} \sum ↓{0≤j<n} \textstyle(U↓j -
{1\over 2})(U↓{j+k} - {1\over 2}) = 0,\qquad\hjust{for all }k ≥ 1.$$
\exno 26. [HM34] (J. Franklin.)\xskip A
white sequence, as defined in the previous exercise, can definitely
fail to be random. Let $U↓0$, $U↓1$, $\ldots$ be an $∞$-distributed
sequence; define the sequence $V↓0$, $V↓1$, $\ldots$ as follows:
$$\vcenter{\halign{$(V↓{2n-1},V↓{2n})=(#)$⊗\qquad if $(U↓{2n-1},U↓{2n})#G,$\cr
U↓{2n-1},U↓{2n}⊗\in\cr
U↓{2n},U↓{2n-1}⊗\notin\cr}}$$
where $G$ is the set $\leftset (x, y)\relv x - {1\over 2}
≤ y ≤ x\hjust{ or }x + {1\over 2} ≤ y\rightset$. Show that\xskip (a) $V↓0$, $V↓1$,
$\ldots$ is equidistributed and white;\xskip (b) $\Pr(V↓n > V↓{n+1})
= {5\over 8}$. (This points out the weakness of the serial correlation
test.)
\exno 27. [HM48] What is the highest
possible value for $\Pr(V↓n > V↓{n+1})$ in an equidistributed,
white sequence? (D. Coppersmith has constructed
such a sequence achieving the value ${7\over 8}$.)
\trexno 28. [HM21] Use the sequence
(11) to construct a $[\,0,1)$ sequence which is 3-distributed,
for which $\Pr(U↓{2n} ≥ {1\over 2}) = {3\over 4}$.
\exno 29. [HM34] Let $X↓0$, $X↓1$, $\ldots$ be a $(2k)$-distributed
binary sequence. Show that
$$\overline{\hjust{Pr}}(X↓{2n} = 0) ≤ {1\over 2} + \left.2k - 1\choose k\right/
2↑{2k}.$$
\trexno 30. [M39] Construct a binary
sequence which is $(2k)$-distributed, and for which
$$\Pr(X↓{2n} = 0) = {1\over 2} + \left.2k-1\choose k\right/2↑{2k}.$$
(Therefore the inequality in the previous exercise is the best possible.)
\exno 31. [M30] Show that $[\,0,1)$ sequences exist which satisfy Definition R5,
yet $\nu↓n/n≥{1\over2}$ for all $n>0$, where $\nu↓n$ is the number of $j<n$ for
which $U↓j<{1\over2}$. (This might be considered a nonrandom property of the
sequence.)
\exno 32. [M24] Given that $\langle
X↓n\rangle$ is a ``random'' $b$-ary sequence according to Definition
R5, and that $\Rscr$ is a computable subsequence rule which specifies
an infinite subsequence $\langle X↓n\rangle \Rscr$, show that
the latter subsequence is not only 1-distributed, it is ``random''
by Definition R5.
\exno 33. [HM22] Let $\langle U↓{r↓n}\rangle$ and
$\langle U↓{s↓n}\rangle$ be infinite disjoint
subsequences of a sequence $\langle U↓n\rangle $. (Thus, $r↓0
< r↓1 < r↓2 < \cdots$ and $s↓0 < s↓1 < s↓2 < \cdots$ are increasing
sequences of integers and $r↓m ≠ s↓n$ for any $m, n$.) Let $\langle
U↓{t↓n}\rangle$ be the combined subsequence, so that $t↓0
< t↓1 < t↓2 < \cdots$ and the set $\{t↓n\} = \{r↓n\} ∪ \{s↓n\}$.
Show that if $\Pr(U↓{r↓n}\in A) = \Pr(U↓{s↓n}\in A)
= p$, then $\Pr(U↓{t↓n}\in A) = p$.
\trexno 34. [M25] Define subsequence rules $\Rscr↓1$, $\Rscr↓2$, $\Rscr↓3$,
$\ldots$ such that Algorithm W can be used with these rules to
give an effective algorithm to construct a $[\,0,1)$ sequence satisfying
Definition R1.
\trexno 35. [HM35] (D. W. Loveland.)\xskip Show that if a binary sequence
$\langle X↓n\rangle$ is R5-random, and if $\langle s↓n\rangle$
is any computable sequence as in Definition R4, we must have
$\overline{\hjust{Pr}}(X↓{s↓n} = 1) ≥ {1\over 2}$ and
$\underline{\hjust{Pr}}(X↓{s↓n} = 1) ≤ {1\over 2}$.
\exno 36. [HM30] Let $\langle X↓n\rangle$ be a binary sequence
which is ``random'' according to Definition R6. Show that the
$[\,0,1)$ sequence $\langle U↓n\rangle$ defined in binary notation
by the scheme
$$\eqalign{U↓0⊗= (0.X↓0)↓2\cr
U↓1⊗=(0.X↓1X↓2)↓2\cr
U↓2⊗=(0.X↓3X↓4X↓5)↓2\cr
U↓3⊗=(0.X↓6X↓7X↓8X↓9)↓2\cr
\noalign{\hjust{$\ldots$}}}$$
is random in the sense of Definition R6.
\exno 37. [M37] (D. Coppersmith.)\xskip
Define a sequence that satisfies Definition R4 but not Definition
R5.\xskip [{\sl Hint:} Consider changing $U↓0$, $U↓1$, $U↓4$, $U↓9$,
$\ldots$ in a truly random sequence.]
\exno 38. [M49] (A. N. Kolmogorov.)\xskip
Given $N$, $n$, and $ε$, what is the smallest number of algorithms
in a set {\bf A} such that no $(n, ε)$-random binary sequences
of length $N$ exist with respect to {\bf A}? (If exact formulas
cannot be given, can asymptotic formulas be found? The point
of this problem is to discover how close the bound (35) comes
to being ``best possible.'')
\exno 39. [HM45] (W. M. Schmidt.)\xskip Let $U↓n$ be a $[\,0, 1)$ sequence,
and let $\nu ↓n(u)$ be the number of nonnegative integers $j
≤ n$ such that $0 ≤ U↓j < u$. Prove that there is a positive
constant $c$ such that, for any $N$ and for any $[\,0, 1)$ sequence
$\langle U↓n\rangle $, we have
$$|\nu ↓n(u) - un| > c \ln N$$
for some $n$ and $u$ with $0 ≤ n < N$, $0 ≤ u< 1$.
$\biglp$In other words, no $[\,0, 1)$
sequence can be {\sl too} equidistributed.$\bigrp$
\trexno 40. [16] (I. J. Good.)\xskip Can a valid table of random digits
contain just one misprint?
\vfill\eject